Abstract. We consider probability measures on $\mathbb{R}^\infty$ and study natural analogs
of optimal transportation mappings for the case of infinite Kantorovich distance.
Our examples include 1) quasi-product measures, 2) measures with certain symmetric
properties, in particular, exchangeable and stationary measures. It turns out that
the existence problem for optimal transportation is closely related to various
ergodic properties. We prove the existence of optimal transportation for a certain
class of stationary Gibbs measures. In addition, we establish a variant of the
Kantorovich duality for the Monge-Kantorovich problem restricted to the case of
measures invariant with respect to actions of compact groups.



1. Introduction

We prove the existence of optimal transportation mappings for certain classes of
measures on $\mathbb{R}^\infty$. The optimal transportation mappings in finite-dimensional
spaces can be constructed as solutions to the (quadratic) Monge-Kantorovich problem.
Given a couple of probability measures $\mu$ and $\nu$ on $\mathbb{R}^d$ with Lebesgue densities,
the corresponding optimal transportation mapping $T\colon \mathbb{R}^d \to \mathbb{R}^d$ gives a minimum
to the functional

$$\int \|x - T(x)\|^2 \, d\mu$$

among all mappings $T\colon \mathbb{R}^d \to \mathbb{R}^d$ transforming $\mu$ onto $\nu$ (here $\|\cdot\|$ is the
standard Euclidean norm). It turns out that $T$ has the form $T(x) = \nabla\varphi(x)$, where
$\varphi$ is a convex function.

The standard existence proof relies on the existence of the solution to the
following problem for measures: find the minimum of the functional

$$W_2^2(\mu, \nu) = \inf\Bigl\{ \int \|x - y\|^2 \, dm : \ m \in P(\mu, \nu) \Bigr\}$$

on the space $P(\mu, \nu)$ of probability measures with fixed projections:
$\mathrm{Pr}_x m = \mu$, $\mathrm{Pr}_y m = \nu$. This problem is called the Monge-Kantorovich problem.
Having a solution $m$ to the Monge-Kantorovich problem, one can easily reconstruct
the desired mapping $T$. It turns out that the optimal measure $m$ is supported on
the graph of $T$:

$$m(\Gamma) = 1, \quad \text{where } \Gamma = \{(x, T(x)),\ x \in \mathbb{R}^d\}.$$

The functional $W_2(\mu, \nu)$ is a distance in the space of probability measures.
In what follows we call it the Kantorovich distance.

Another well-known fact which will be used throughout the paper is the following 
relation called the Kantorovich duality: 

Wi{n,v) = J(<p,ip), 

where 

J((p,ip) = mf| J (tp(x) - y) d V + J (ip(y) - y) dv, <p(x) + ip(y) > (x,y)}, 

where the supremum is taken over couples of integrable Borel functions $\varphi(x)$, $\psi(y)$.
Note that the function $\varphi$ in the dual problem coincides with the potential generating
the transportation mapping: $T = \nabla\varphi$.

The mapping $T$ exists under quite broad assumptions. It is sufficient that $\mu$
and $\nu$ have densities and admit finite second moments (see [TB]). In this case the
existence of $\varphi$ and $T$ follows immediately from the existence of the solution to the
dual Kantorovich problem. More on optimal transportation can be found in [I],

The situation in the infinite-dimensional case is still not well-understood. The 
main reason for this is the fact that natural norms associated with measures are 
infinite almost everywhere. The archetypical example is given by the Cameron- 
Martin norm of a Gaussian measure. Nevertheless, for certain couples of measures 
the transportation problem has a natural formulation and a unique solution. 

Example 1.1. Let $\gamma = \prod_{i=1}^\infty \gamma_i = \prod_{i=1}^\infty \frac{1}{\sqrt{2\pi}} e^{-x_i^2/2}\, dx_i$ be the standard Gaussian
product measure on $\mathbb{R}^\infty$ and $H = l^2$, $\|x\|_H^2 = \sum_{i=1}^\infty x_i^2$, be the corresponding
Cameron-Martin space. More generally, one can consider any abstract Wiener
space.

The optimal transportation problem is well-understood for the case of measures
$\mu$ and $\nu$ which are absolutely continuous with respect to $\gamma$. The most general
results were obtained in [11] (another approach has been developed in [12]). In
particular, for any given probability measure $f \cdot \gamma$ there exists a transportation
mapping $T(x) = x + \nabla\varphi(x)$ minimizing the cost

$$\int \|T(x) - x\|_H^2 \, d\gamma$$

and transforming $\gamma$ onto $f \cdot \gamma$, provided $\int f \log f \, d\gamma < \infty$. Analogously, there exists
a transportation mapping transforming $f \cdot \gamma$ onto $\gamma$.

It is known (this follows from the so-called Talagrand transportation inequality)
that under the assumption $\int f \log f \, d\gamma < \infty$ the Kantorovich distance between $\gamma$ and
$f \cdot \gamma$ is finite:

$$W_2^2(\gamma, f \cdot \gamma) = \int \|T(x) - x\|_H^2 \, d\gamma \le 2 \int f \log f \, d\gamma.$$

In particular, $\nabla\varphi(x) \in l^2$ for $\gamma$-almost all $x$. More on optimal transportation on the
Wiener space, the corresponding Monge-Ampere equation, regularity issues, and
transportation on other infinite-dimensional spaces can be found in [3], [5], [7], [5],
[TO], and 0.

We state now the central problem of this paper. 

Problem 1.2. Let $\mu$ and $\nu$ be two probability measures on $\mathbb{R}^\infty$. When does there
exist a transportation mapping $T$ transforming $\mu$ onto $\nu$ which is "optimal" for the
cost function $c(x, y) = \|x - y\|_{l^2}^2$?

Note that we do not assume finiteness of the Kantorovich distance between
the measures. Of course, this makes it impossible in general to understand $T$ as
a solution to a certain minimization problem. Nevertheless, there are good
candidates to be called "optimal transportation" in many particular cases. The
following example motivates our study.

Example 1.3. 1) Let $\mu = \prod_{i=1}^\infty \mu_i(dx_i)$, $\nu = \prod_{i=1}^\infty \nu_i(dx_i)$ be product probability
measures. Assume that $\mu_i$, $\nu_i$ have densities. Then there exists a mass
transportation mapping $T$ taking $\mu$ onto $\nu$ which has the form

$$T(x) = (T_1(x_1), \ldots, T_i(x_i), \ldots),$$

where $T_i(x_i)$ is the one-dimensional optimal transportation transforming $\mu_i$ onto
$\nu_i$.

2) Let us consider the Gaussian measure $\mu$ obtained from the standard Gaussian
measure $\gamma$ by a linear mapping $T(x) = Ax$ with $A$ symmetric and positive. It
is known (and can be obtained from the law of large numbers) that $\gamma$ and $\mu$ are
mutually singular even in the simplest case $A = 2 \cdot \mathrm{Id}$. $T$ is "optimal" because it is
linear and given by a positive symmetric operator. Heuristically,

$$T(x) = \tfrac{1}{2} \nabla \langle Ax, x \rangle.$$

It is clear that in both cases $T$ cannot be obtained as a minimizer of a functional
of the type $\int \|T(x) - x\|_{l^2}^2 \, d\mu$.
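The diagonal mapping of part 1) can be sketched numerically. The snippet below is only an illustration under extra assumptions that are not in the text: the marginals are Gaussian, only three coordinates are used, and the helper names `monotone_map` and `diagonal_transport` are hypothetical. It uses the standard fact that on the line the optimal map between measures with densities is the increasing rearrangement $T_i = F_{\nu_i}^{-1} \circ F_{\mu_i}$.

```python
from statistics import NormalDist

def monotone_map(mu, nu):
    """Increasing rearrangement T = F_nu^{-1} o F_mu pushing mu onto nu (1-d case)."""
    return lambda x: nu.inv_cdf(mu.cdf(x))

def diagonal_transport(mus, nus):
    """Coordinatewise map T(x) = (T_1(x_1), ..., T_n(x_n)) for product marginals."""
    maps = [monotone_map(m, n) for m, n in zip(mus, nus)]
    return lambda x: [T_i(x_i) for T_i, x_i in zip(maps, x)]

# Assumed example: three standard Gaussian sources and three Gaussian targets.
mus = [NormalDist(0.0, 1.0)] * 3
nus = [NormalDist(0.0, 2.0), NormalDist(1.0, 1.0), NormalDist(-1.0, 0.5)]

T = diagonal_transport(mus, nus)
image = T([0.5, -0.3, 1.2])
print(image)
```

For Gaussian marginals each $T_i$ is affine, so the first coordinate here is exactly the dilation $x \mapsto 2x$ from part 2) with $A = 2 \cdot \mathrm{Id}$.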

We fix the standard basis $\{e_i\}$, $e_i = (\delta_{ij})$, in $\mathbb{R}^\infty$. Denote by $\mathbb{R}^n$ the subspace of
$\mathbb{R}^\infty$ generated by $\{e_1, \ldots, e_n\}$ and by $P_n$ the orthogonal projection onto $\mathbb{R}^n$.

In this paper we consider two (possibly non-equivalent) definitions of the
Monge-Kantorovich optimal mappings in the infinite-dimensional case:

D1) limits of finite-dimensional optimal mappings,

D2) (in the symmetric case) solutions to the classical Monge problem for another
(finite) cost function constrained to a set of symmetric measures.
Almost everywhere in this paper we use approach D1). Let us briefly explain D2).

It is possible to give a meaning to the Monge-Kantorovich optimization problem
if we restrict ourselves to a certain class of symmetric measures. In this paper we
consider two types of symmetry: exchangeable measures (invariant with respect to
finite permutations of coordinates) and stationary measures on $\mathbb{R}^{\mathbb{Z}}$ (invariant with
respect to shifts of coordinates). Note that $\|x - y\|_{l^2}^2$ is symmetric with respect
to both types of symmetry. More generally, let $G$ be a group of linear operators
which acts on $X = Y = \mathbb{R}^\infty$ and on $X \times Y$: $x \mapsto gx$, $(x, y) \mapsto (gx, gy)$, $g \in G$, and
preserves the cost function $c(x, y)$. We assume that every basic vector $e_i$ can be
obtained from every other by the action of this group: there exists $g \in G$ such
that $e_i = g e_j$. Note that under these assumptions all the coordinates are identically
distributed. This leads us to the following definition: given $G$-invariant marginals $\mu$
and $\nu$ we call $\pi$ an optimal solution to the Monge-Kantorovich problem if $\pi$ solves
the Monge-Kantorovich problem

$$\int (x_1 - y_1)^2 \, d\pi \to \min$$

among all measures which are invariant with respect to $G$. If there exists a mapping
$T$ such that its graph $\Gamma = \{(x, T(x))\}$ satisfies $\pi(\Gamma) = 1$, we say that $T$ is an optimal
transportation mapping transforming $\mu$ onto $\nu$.

Remark 1.4. 1) In fact, we will use definition D1) throughout the paper (more
precisely, Definition 1.6). See, however, Section 6.

2) The existence of a solution to the symmetric Monge-Kantorovich problem
can be established by standard compactness arguments.

3) The corresponding optimal transportation (if it exists) must preserve $G$-invariant
sets. This observation allows us to construct the following counter-example
to the existence Problem 1.2. See also Example 6.4.

Example 1.5. Let $\mu = \gamma$ be the standard Gaussian measure on $\mathbb{R}^\infty$ and

$$\nu = \tfrac{1}{2} (\gamma + \gamma_2)$$

be the average of $\gamma$ and its homothetic image $\gamma_2 = \gamma \circ S^{-1}$, where $S(x) = 2x$. There
is no mass transportation $T$ of $\mu$ to $\nu$ which preserves "rotational invariance"
or even exchangeability of a set (i.e., if $A$ is invariant with respect to cylindrical
rotations, then $T(A)$ is invariant too). Indeed, any mapping of such a type must
have the form $T(x) = g(x)(x_1, x_2, \ldots) = g(x) \cdot x$, where $g$ is invariant with respect
to any "rotation", in particular, with respect to any coordinate permutation. But
any function $g$ of this type is constant $\gamma$-a.e. This is a corollary of the
Hewitt-Savage 0-1 law. It is clear that there is no mass transportation of this type
for the given target measure.

There is a general principle behind this simple example. Recall that a measure
$\mu$ is called ergodic with respect to a group action $G$ if for every $G$-invariant set $A$
one has either $\mu(A) = 1$ or $\mu(A) = 0$. It follows directly from the definition that
there is no bijective mass transportation $T$ transforming $\mu$ onto $\nu$ such that
$T(A)$ is $G$-invariant for any $G$-invariant set $A$, provided $\mu$ is ergodic but $\nu$ is not.

Definition 1.6. We say that a measurable mapping $T$ transforming $\mu$ onto $\nu$ is
optimal if the measure $\pi = \mu \circ (x, T(x))^{-1}$ on the graph

$$\{(x, T(x)),\ x \in \mathrm{supp}(\mu)\} \subset \mathbb{R}^\infty \times \mathbb{R}^\infty$$

of $T$ can be obtained as a weak limit of probability measures $\pi_n$ such that 1) the
support of $\pi_n$ is contained in $\mathbb{R}^n \times \mathbb{R}^n$, 2) $\pi_n$ is a solution of a finite-dimensional
Kantorovich problem with quadratic cost $\sum_{i=1}^n |x_i - y_i|^2$.
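To make the finite-dimensional building block of this definition concrete, here is a minimal sketch under assumptions that are not in the text: both measures are uniform on the same number of atoms, in which case the Kantorovich problem has an optimal plan given by a permutation, and the problem is solved by brute force. The function names and the sample points are hypothetical.

```python
from itertools import permutations

def cost(xs, ys, perm):
    """Average quadratic cost sum_i |x_i - y_i|^2 of the coupling xs[k] -> ys[perm[k]]."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(x, ys[j])) for x, j in zip(xs, perm)
    ) / len(xs)

def optimal_assignment(xs, ys):
    """Brute-force minimizer over all permutation couplings (small n only)."""
    return min(permutations(range(len(ys))), key=lambda p: cost(xs, ys, p))

# Assumed data: two empirical measures on R^2 with four equally weighted atoms.
xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
ys = [(3.0, 2.0), (0.5, 0.5), (1.0, -1.0), (0.0, 3.0)]
perm = optimal_assignment(xs, ys)
print(perm, cost(xs, ys, perm))

# One-dimensional sanity check: the optimal coupling is the monotone one.
xs1 = [(0.3,), (-1.2,), (2.5,), (0.9,)]
ys1 = [(1.0,), (4.0,), (-0.5,), (2.2,)]
perm1 = optimal_assignment(xs1, ys1)
```

The brute-force search is only meant for a handful of atoms; it illustrates the problem solved by each $\pi_n$, not a practical algorithm.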

Let us summarize the main results obtained in this paper. All of them are
applicable to the particular case $\nu = \gamma$, where $\gamma$ is the standard Gaussian measure
(or, more generally, $\nu$ is a uniformly log-concave product measure). This very
special case of the target measure is important for applications (see, for example,
[T5]). In Section 3 we give some general sufficient conditions for the existence of
transportation mappings. We prove existence in the following cases:

C1) $\mu$ is a quasi-product measure, i.e. $\mu$ has a density with respect to some
product measure (Section 4),
C2) $\mu$ is exchangeable (Section 6),
C3) $\mu$ is stationary (Section 7).

The transportation in the quasi-product case C1) can be viewed as a perturbation
of the "diagonal" transportation described in Example 1.3. More precisely, if $\mu$ and
$\nu$ have densities with respect to (different) product measures $P$ and $Q$, then the
optimal transportation $T$ transforming $\mu$ onto $\nu$ has the form

$$T = T_0 + \tilde{T},$$

where $T_0$ is the transportation of $P$ onto $Q$ described in Example 1.3 and $\tilde{T}$ is
"small" compared to $T_0$. This result generalizes the results on the Wiener space
obtained in [11], [12].

We use in C2) a De Finetti-type decomposition theorem, representing exchangeable
measures as averages of countable powers of one-dimensional measures:

$$\mu(B) = \int m^\infty(B) \, \Pi_\mu(dm), \qquad \nu(B) = \int m^\infty(B) \, \Pi_\nu(dm),$$

where $m$ belongs to the space $\mathcal{P}(\mathbb{R})$ of Borel probability measures on $\mathbb{R}$ and $\Pi_\mu$, $\Pi_\nu$
are probability measures on $\mathcal{P}(\mathbb{R})$. We show that the problem of existence of an
optimal transportation (in the sense of D2)) transforming $\mu$ onto $\nu$ is reduced
(in a sense) to the optimal transportation problem for $\Pi_\mu$, $\Pi_\nu$ with the squared
Kantorovich distance on $\mathcal{P}(\mathbb{R})$ as the cost function.
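The De Finetti-type representation can be illustrated in the simplest discrete setting. The following sketch is an assumption-laden toy case, not the construction used in the paper: the one-dimensional measures $m$ are Bernoulli laws, the mixing measure $\Pi_\mu$ is supported on two points, and the names `mixture_prob` and `is_exchangeable` are hypothetical. It checks exactly that the mixture is invariant under permutations of finitely many coordinates while not being a product measure.

```python
from fractions import Fraction
from itertools import permutations, product

def mixture_prob(pattern, mixing):
    """mu(x_1=b_1,...,x_n=b_n) = sum_k w_k * prod_i m_k({b_i}) with m_k = Bernoulli(p_k)."""
    total = Fraction(0)
    for p, w in mixing:
        term = w
        for b in pattern:
            term *= p if b == 1 else 1 - p
        total += term
    return total

def is_exchangeable(n, mixing):
    """Check invariance of the n-dimensional marginal under all coordinate permutations."""
    return all(
        mixture_prob(pat, mixing) == mixture_prob(tuple(pat[i] for i in perm), mixing)
        for pat in product((0, 1), repeat=n)
        for perm in permutations(range(n))
    )

# Assumed mixing measure: Bernoulli(1/4) and Bernoulli(3/4) with equal weights.
mixing = [(Fraction(1, 4), Fraction(1, 2)), (Fraction(3, 4), Fraction(1, 2))]

print(is_exchangeable(3, mixing))
print(mixture_prob((1,), mixing), mixture_prob((1, 1), mixing))
```

The two-coordinate probability differs from the square of the one-coordinate probability, so the coordinates are exchangeable but not independent.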

The proof in C3) follows the ideas from [12]. We apply a Talagrand-type estimate
of the $L^2$-distance between transportation mappings via the relative entropy of the
corresponding measures. For any probability measures $\mu = e^{-V} dx$, $\nu = e^{-W} dx$
on $\mathbb{R}^d$ and the corresponding optimal transportation mappings $T_\mu$, $T_\nu$, taking $\mu$,
$\nu$ onto the standard Gaussian measure on $\mathbb{R}^d$, respectively, the following estimate
holds:

$$\mathrm{Ent}_\nu\Bigl(\frac{\mu}{\nu}\Bigr) = \int \log \frac{d\mu}{d\nu} \, d\mu \ge \frac{1}{2} \int \|T_\mu - T_\nu\|^2 \, d\mu.$$

The main example of Section 7 is given by a Gibbs measure on the lattice $\mathbb{R}^{\mathbb{Z}}$ with
the following formal shift-invariant Hamiltonian:

$$H = \sum_{i=-\infty}^{\infty} \bigl( V(x_i) + W(x_i, x_{i+1}) \bigr).$$

The existence results for such measures can be found in [2].

In addition, we establish a variant of the Kantorovich duality for the
Monge-Kantorovich problem restricted to the space of measures invariant with respect to
actions of a compact group (Section 5). Unfortunately, we do not yet know what an
adequate generalization of this statement to non-compact groups (which is
the most interesting case) should be.

2. Some preliminary estimates 

Let $\mu$ and $\nu$ be probability measures on $\mathbb{R}^d$ and $T(x) = \nabla\varphi(x)$ be the optimal
transportation transforming $\mu$ onto $\nu$. Let us denote by $\mu_v$ the images of $\mu$ under
the shifts $x \mapsto x + v$, $v \in \mathbb{R}^d$.

It will be assumed throughout that $\mu_v$ have densities with respect to $\mu$:

$$\frac{d\mu_v}{d\mu} = e^{\beta_v}.$$



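Lemma 2.1. For every $p, q > 1$ with $\frac{1}{p} + \frac{1}{q} = 1$, $\varepsilon > 0$, and $e \in \mathbb{R}^d$

$$\int |\varphi(x + te) - \varphi(x)|^{1+\varepsilon} \, d\mu \le t^{1+\varepsilon} \bigl\| |\langle x, e \rangle|^{1+\varepsilon} \bigr\|_{L^p(\nu)} \cdot \sup_{0 \le s \le t} \|e^{\beta_{se}}\|_{L^q(\mu)},$$

$$\int \bigl( \varphi(x + te) - \varphi(x) - t \partial_e \varphi(x) \bigr) \, d\mu \le t \, \|\langle x, e \rangle\|_{L^p(\nu)} \cdot \sup_{0 \le s \le t} \|e^{\beta_{se}} - 1\|_{L^q(\mu)}.$$

Proof. One has $\varphi(x + te) - \varphi(x) = \int_0^t \partial_e \varphi(x + se) \, ds$. Hence

$$\int |\varphi(x + te) - \varphi(x)|^{1+\varepsilon} \, d\mu \le t^{\varepsilon} \int \int_0^t |\partial_e \varphi|^{1+\varepsilon}(x + se) \, ds \, d\mu$$

$$= t^{\varepsilon} \int_0^t \Bigl[ \int |\partial_e \varphi|^{1+\varepsilon} e^{\beta_{se}} \, d\mu \Bigr] ds \le t^{1+\varepsilon} \bigl\| |\partial_e \varphi|^{1+\varepsilon} \bigr\|_{L^p(\mu)} \cdot \sup_{0 \le s \le t} \|e^{\beta_{se}}\|_{L^q(\mu)}$$

$$= t^{1+\varepsilon} \bigl\| |\langle x, e \rangle|^{1+\varepsilon} \bigr\|_{L^p(\nu)} \cdot \sup_{0 \le s \le t} \|e^{\beta_{se}}\|_{L^q(\mu)}.$$

Applying the same arguments one gets

$$\int \bigl( \varphi(x + te) - \varphi(x) - t \partial_e \varphi(x) \bigr) \, d\mu = \int \int_0^t \bigl( \partial_e \varphi(x + se) - \partial_e \varphi(x) \bigr) \, ds \, d\mu$$

$$= \int \Bigl[ \int_0^t (e^{\beta_{se}} - 1) \, ds \Bigr] \partial_e \varphi(x) \, d\mu \le t^{1/p} \|\partial_e \varphi\|_{L^p(\mu)} \Bigl[ \int \int_0^t |e^{\beta_{se}} - 1|^q \, ds \, d\mu \Bigr]^{1/q}.$$

The desired estimate follows from the change of variables formula and trivial
uniform bounds. □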

We recall that a probability measure $\mu$ on $\mathbb{R}^d$ is called log-concave if it has
the form $e^{-V} \cdot \mathcal{H}^k|_L$, where $\mathcal{H}^k$ is the $k$-dimensional Hausdorff measure,
$k \in \{0, 1, \ldots, d\}$, $L$ is an affine subspace, and $V$ is a convex function. We call a
measure $\mu$ uniformly log-concave (more precisely, $K$-uniformly log-concave) if
$\frac{1}{Z} e^{\frac{K}{2} |x|^2} \cdot \mu$ is a log-concave measure for some $K > 0$ and a suitable renormalization
factor $Z$. It is well-known (C. Borell) that the projections and conditional measures
of log-concave measures are log-concave. The same holds for uniformly log-concave
measures. We can extend this notion to the infinite-dimensional case. Namely, we
call a probability measure $\mu$ on a locally convex space $X$ log-concave ($K$-uniformly
log-concave with $K > 0$) if its images $\mu \circ l^{-1}$, $l \in X^*$, under linear continuous
functionals $l$ are all log-concave ($K$-uniformly log-concave with $K > 0$).

Throughout the paper we apply the following estimate (see [12], [13]).

Proposition 2.2. Let $m$ be a $K$-uniformly log-concave probability measure with
some $K > 0$. Then for any couple of probability measures $\mu = e^{-V} dx$, $\nu = e^{-W} dx$
and the corresponding optimal mappings $\nabla\varphi_\mu$, $\nabla\varphi_\nu$, transforming $\mu$, $\nu$ onto $m$
respectively, one has the following estimate

$$\mathrm{Ent}_\nu\Bigl(\frac{\mu}{\nu}\Bigr) = \int \log \frac{d\mu}{d\nu} \, d\mu = \int (W - V) \, d\mu \ge \frac{K}{2} \int (\nabla\varphi_\mu - \nabla\varphi_\nu)^2 \, d\mu.$$

The quantity $\mathrm{Ent}_\nu\bigl(\frac{\mu}{\nu}\bigr)$ is called the relative entropy or Kullback-Leibler distance
between $\mu$ and $\nu$.
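The estimate can be checked by hand in a one-dimensional Gaussian case, which the sketch below automates. Everything specific here is an assumption made for illustration: $\mu = N(0, s^2)$, $\nu = N(0, t^2)$, target $m = N(0, 1/K)$, and the function names are hypothetical. In this case the optimal maps are linear, $\nabla\varphi_\mu(x) = x / (s \sqrt{K})$ and $\nabla\varphi_\nu(x) = x / (t \sqrt{K})$, and the relative entropy is given by the closed formula $\log(t/s) + s^2 / (2 t^2) - 1/2$.

```python
import math

def relative_entropy(s, t):
    """Ent of mu = N(0, s^2) with respect to nu = N(0, t^2), closed form."""
    return math.log(t / s) + s * s / (2 * t * t) - 0.5

def transport_gap(s, t, K=1.0):
    """(K/2) * int (grad phi_mu - grad phi_nu)^2 d mu for the target m = N(0, 1/K).

    Both maps are linear: x -> sigma * x / s and x -> sigma * x / t with
    sigma = 1 / sqrt(K), and the second moment of mu equals s^2.
    """
    sigma = 1.0 / math.sqrt(K)
    return 0.5 * K * (sigma / s - sigma / t) ** 2 * s * s

for s, t, K in [(1.0, 2.0, 1.0), (0.5, 3.0, 4.0), (2.0, 0.7, 0.25), (1.0, 1.0, 1.0)]:
    print(s, t, K, relative_entropy(s, t), transport_gap(s, t, K))
```

With $r = s/t$ the difference of the two sides equals $r - 1 - \log r \ge 0$, so in this family the inequality is strict unless the two measures coincide.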

In addition, we will apply the following elementary Lemma. 
Lemma 2.3. Assume that a sequence $\{T_n\}$ of measurable mappings $T_n\colon \mathbb{R}^\infty \to \mathbb{R}^\infty$
converges to a mapping $T$ in the following sense: for every $i$, $\lim_n \langle T_n, e_i \rangle = \langle T, e_i \rangle$
in measure with respect to $\mu$. Then the measures $\{\mu \circ T_n^{-1}\}$ converge weakly to
$\mu \circ T^{-1}$.

3. Sufficient conditions for existence 

We consider a couple of Borel probability measures $\mu$ and $\nu$ on $\mathbb{R}^\infty$, where
$\mathbb{R}^\infty$ is the space of all real sequences: $\mathbb{R}^\infty = \prod_{i=1}^\infty \mathbb{R}_i$. We deal with the standard
coordinate system $x = (x_1, x_2, \ldots, x_n, \ldots)$ and the standard basis vectors
$e_i = (\delta_{ij})$. The projection on the first $n$ coordinates will be denoted by $P_n$:
$P_n(x) = (x_1, \ldots, x_n)$. We use notations $\|x\|$, $\langle x, y \rangle$ for the Hilbert space norm and
inner product: $\|x\|^2 = \sum_{i=1}^\infty x_i^2$, $\langle x, y \rangle = \sum_{i=1}^\infty x_i y_i$. We use notation $\mathbb{E}_\mu^n$ for the
conditional expectation with respect to $\mu$ and the $\sigma$-algebra generated by $x_1, \ldots, x_n$.
For any product measure $P = \prod_{i=1}^\infty p_i(x_i) \, dx_i$ its projection $P_n = P \circ P_n^{-1}$ has the
form $\prod_{i=1}^n p_i(x_i) \, dx_i$ and the projection $(f \cdot P) \circ P_n^{-1} = f_n \cdot P_n$ of the measure $f \cdot P$
satisfies $f_n = \mathbb{E}_P^n f$. Everywhere below we agree that every cylindrical function
$f = f(x_1, \ldots, x_n)$ can be extended to $\mathbb{R}^\infty$ by the formula $x \mapsto f(P_n x)$.

It will be assumed throughout the paper that the shifts of $\mu$ along any vector
$v = t e_i$ are absolutely continuous with respect to $\mu$:

$$\frac{d\mu_v}{d\mu} = e^{\beta_v}.$$

In Section 3, moreover, the following assumption holds.

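Assumption (A). For every basic vector $e = e_i$ there exist $p > 1$, $q > 1$,
satisfying $\frac{1}{p} + \frac{1}{q} = 1$, and $\varepsilon > 0$ such that

$$\int |\langle x, e \rangle|^{(1+\varepsilon)p} \, d\nu < \infty$$

and

$$p(t) = \sup_{0 \le s \le t} \int |e^{\beta_{se}} - 1|^q \, d\mu$$

satisfies $\lim_{t \to 0} p(t) = 0$.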

Let $\mu_n = \mu \circ P_n^{-1}$, $\nu_n = \nu \circ P_n^{-1}$ be the projections of $\mu$, $\nu$. For every
$v = t e_i$ let us set

$$\frac{d(\mu_n)_v}{d\mu_n} = e^{\beta_v^{(n)}}.$$

It is easy to check that the projections of $\mu$, $\nu$ satisfy Assumption (A).

Lemma 3.1. For every $n \in \mathbb{N}$ and every $e = e_i$ one has

$$\int |\langle P_n(x), e \rangle|^p \, d\nu_n \le \int |\langle x, e \rangle|^p \, d\nu, \qquad \int |e^{\beta_v^{(n)}} - 1|^q \, d\mu_n \le \int |e^{\beta_v} - 1|^q \, d\mu.$$

Proof. The first estimate is trivial. To prove the second one, let us note that
$e^{\beta_v^{(n)}} = \mathbb{E}_\mu^n e^{\beta_v}$. The claim follows from the Jensen inequality and convexity of the
function $t \mapsto |t - 1|^q$. □

We denote by $\pi_n$ the optimal transportation plan for the couple $(\mu_n, \nu_n)$. Let
$\varphi_n(x)$ and $\psi_n(y)$ solve the dual Kantorovich problem. Let us recall that $\nabla\varphi_n$ ($\nabla\psi_n$)
is the optimal transportation mapping sending $\mu_n$ to $\nu_n$ ($\nu_n$ to $\mu_n$). One has

$$\varphi_n(x) + \psi_n(y) \ge \langle P_n x, P_n y \rangle$$

for every $x, y$. The equality is attained on the support of $\pi_n$. In particular,

$$\varphi_n(x) + \psi_n(\nabla\varphi_n(x)) = \langle P_n x, \nabla\varphi_n(x) \rangle.$$

It is easy to check that $\{\pi_n\}$ is a tight sequence. By the Prokhorov theorem one
can extract a weakly convergent subsequence $\pi_{n_k} \to \pi$. Note that $\pi_n$ is not the
projection of $\pi$.

In what follows we will pass several times to subsequences and use for the new
subsequences the same index $n$ again, with the agreement that $n$ takes values in
another infinite set $\mathbb{N}' \subset \mathbb{N}$. Let us fix unit vectors $e_i$, $e_j$ for some $i, j \in \mathbb{N}$ and
consider the following sequence of non-negative functions:

$$F_n(x, y, t, s) = \varphi_n(x + t e_i) + \psi_n(y + s e_j) - \langle P_n(x + t e_i), P_n(y + s e_j) \rangle$$

with $n > i$, $n > j$.

Lemma 3.2. There exists an $L^{1+\varepsilon}(\pi)$-weakly convergent subsequence

$$\varphi_{n_k}(x + t e_i) - \varphi_{n_k}(x) \to U(x).$$

The following relation holds for the limiting function $U(x)$:

$$\int U(x) \, d\mu - t \int \langle y, e_i \rangle \, d\nu \le C t \, p(t).$$

Proof. Taking into account that $\int F_n(x, y, 0, 0) \, d\pi_n = 0$, one obtains

$$\int F_n(x, y, t, 0) \, d\pi_n = \int F_n(x, y, t, 0) \, d\pi_n - \int F_n(x, y, 0, 0) \, d\pi_n \ge 0.$$

Note that the right-hand side equals

$$\int \bigl( F_n(x, y, t, 0) - F_n(x, y, 0, 0) \bigr) \, d\pi_n = \int \bigl[ \varphi_n(x + t e_i) - \varphi_n(x) - t \langle y, e_i \rangle \bigr] \, d\pi_n.$$

Taking into account that the projection of $\pi_n$ onto $X$ coincides with $\mu_n$ and $\varphi_n$
depends on the first $n$ coordinates, one finally obtains that for $n > i$ the latter is
equal to

$$\int \bigl[ \varphi_n(x + t e_i) - \varphi_n(x) \bigr] \, d\mu - t \int \langle y, e_i \rangle \, d\nu = \int \bigl[ \varphi_n(x + t e_i) - \varphi_n(x) - t \partial_{e_i} \varphi_n(x) \bigr] \, d\mu.$$

It follows from Lemma 2.1, Lemma 3.1 and Assumption (A) that

$$\int F_n(x, y, t, 0) \, d\pi_n \le C t \, p(t). \qquad (1)$$

Since $\varphi_n$ depends on a finite number of coordinates, one has
$\int |\varphi_n(x + t e_i) - \varphi_n(x)|^{1+\varepsilon} \, d\mu = \int |\varphi_n(x + t e_i) - \varphi_n(x)|^{1+\varepsilon} \, d\mu_n$. Hence by Lemma 2.1

$$U_n(x) = \varphi_n(x + t e_i) - \varphi_n(x) \in L^{1+\varepsilon}(\mu)$$

and, moreover, $\sup_n \|U_n\|_{L^{1+\varepsilon}(\mu)} < \infty$. Thus there exists a function $U \in L^{1+\varepsilon}(\mu)$
such that for some subsequence

$$\varphi_{n_k}(x + t e_i) - \varphi_{n_k}(x) \to U(x)$$

weakly in $L^{1+\varepsilon}(\mu)$. Passing to the limit we obtain from (1) that

$$\int U(x) \, d\mu - t \int \langle y, e_i \rangle \, d\nu \le C t \, p(t).$$

□



Lemma 3.3. Assume that $F_n(x, y, 0, 0) \to 0$ in measure with respect to $\pi$. Then

$$U(x) - t \langle y, e_i \rangle \ge 0$$

for $\pi$-almost all $(x, y)$.

Proof. Note that

$$\bigl[ \varphi_n(x + t e_i) - \varphi_n(x) - t \langle y, e_i \rangle \bigr] + F_n(x, y, 0, 0) = \varphi_n(x + t e_i) + \psi_n(y) - \langle P_n y, P_n(x + t e_i) \rangle$$

is a non-negative function for every $n$. Since $F_n(x, y, 0, 0) \to 0$ in measure, there
exists a subsequence (denoted again by $F_n$) which converges to zero $\pi$-almost
everywhere. Since $f_n = \varphi_n(x + t e_i) - \varphi_n(x) - t \langle y, e_i \rangle$ converges to $f = U(x) - t \langle y, e_i \rangle$
weakly in $L^{1+\varepsilon}(\pi)$, one can assume (passing again to a subsequence) that
$\frac{1}{k} \sum_{n=1}^k f_n \to f$ $\pi$-a.e. Since $f_n + F_n \ge 0$, this implies that $f \ge 0$ $\pi$-a.e. □

Proposition 3.4. Assume that there exists a sequence of continuous functions
$f_n(x_1, \ldots, x_n)$, $g_n(y_1, \ldots, y_n)$
such that $G_n = f_n(x) + g_n(y) - \sum_{i=1}^n x_i y_i$ has the following properties:

1) $G_n \ge 0$,

2) $G_n \le G_m$, $\forall\, n \le m$, $x, y \in \mathbb{R}^n$,

3) $\sup_n \int G_n \, d\pi_n < \infty$.

Then $F_n(x, y, 0, 0) \to 0$ in $L^1(\pi)$.

Proof. We start with the identity $\int F_n(x, y, 0, 0) \, d\pi_n = 0$ and rewrite it in the
following way:

$$0 = \int (\varphi_n - f_n) \, d\mu + \int (\psi_n - g_n) \, d\nu + \int \Bigl( f_n(x) + g_n(y) - \sum_{i=1}^n x_i y_i \Bigr) \, d\pi_n. \qquad (2)$$

Since $\varphi_n$, $\psi_n$ are defined up to a constant, one can assume that $\int (\psi_n - g_n) \, d\nu = 0$.
Thus $-\int (\varphi_n - f_n) \, d\mu = \int \bigl( f_n(x) + g_n(y) - \sum_{i=1}^n x_i y_i \bigr) \, d\pi_n$. It follows from 2) and
3) that both sides have limits. It follows from the weak convergence $\pi_n \to \pi$ and
the monotonicity property 2) that for every $k$

$$\lim_n \int \Bigl( f_n(x) + g_n(y) - \sum_{i=1}^n x_i y_i \Bigr) \, d\pi_n \ge \lim_n \int \Bigl( f_k(x) + g_k(y) - \sum_{i=1}^k x_i y_i \Bigr) \, d\pi_n$$

$$= \int \Bigl( f_k(x) + g_k(y) - \sum_{i=1}^k x_i y_i \Bigr) \, d\pi.$$

Hence

$$\lim_n \int \Bigl( f_n(x) + g_n(y) - \sum_{i=1}^n x_i y_i \Bigr) \, d\pi_n \ge \lim_k \int \Bigl( f_k(x) + g_k(y) - \sum_{i=1}^k x_i y_i \Bigr) \, d\pi,$$

where the limit in the right-hand side exists, because the sequence is monotone.
Hence we get from (2)

$$0 \ge \lim_n \int (\varphi_n - f_n) \, d\mu + \lim_n \int \Bigl( f_n(x) + g_n(y) - \sum_{i=1}^n x_i y_i \Bigr) \, d\pi.$$

Taking into account that $\int g_n \, d\pi = \int g_n \, d\nu = \int \psi_n \, d\nu = \int \psi_n \, d\pi$, we obtain

$$0 \ge \lim_n \int (\varphi_n - f_n)(x) \, d\mu + \lim_n \int \Bigl( f_n(x) + g_n(y) - \sum_{i=1}^n x_i y_i \Bigr) \, d\pi$$

$$= \lim_n \Bigl( \int \Bigl( \varphi_n(x) + \psi_n(y) - \sum_{i=1}^n x_i y_i \Bigr) \, d\pi \Bigr) \ge 0.$$

The proof is complete. □

Finally, we obtain a sufficient condition for the existence of an optimal mapping 
in the infinite-dimensional case. 

Proposition 3.5. Assume that $F_n(x, y, 0, 0) \to 0$ in measure with respect to $\pi$.
Then there exists a mapping $T\colon \mathbb{R}^\infty \to \mathbb{R}^\infty$ such that

$$T(x) = y$$

for $\pi$-almost all $(x, y)$.

Proof. Let us fix $e_i$ and choose a sequence of numbers $t_n \to 0$. We get from Lemma 3.2
and Lemma 3.3 that there exist $\pi$-a.e. nonnegative functions $U_{t_n}(x) - t_n \langle y, e_i \rangle$
with $\int (U_{t_n}(x) - t_n \langle y, e_i \rangle) \, d\pi = o(t_n)$. Hence, $\lim_{t_n \to 0} \int \bigl( \frac{U_{t_n}}{t_n} - \langle y, e_i \rangle \bigr) \, d\pi = 0$.
Taking into account that $\frac{U_{t_n}}{t_n} - \langle y, e_i \rangle \ge 0$ for $\pi$-almost all $(x, y)$, we conclude that
$\frac{U_{t_n}}{t_n}$ converges $\mu$-a.e. and in $L^1(\mu)$ to a function $u_i(x)$ satisfying $u_i(x) - \langle y, e_i \rangle \ge 0$
and $\int (u_i(x) - \langle y, e_i \rangle) \, d\pi = 0$. Clearly, $u_i(x) = \langle y, e_i \rangle$ for $\pi$-almost all $(x, y)$.
Repeating these arguments for every $i \in \mathbb{N}$, we get the claim. □



4. Quasi-product case

In what follows we consider two product measures

$$P = \prod_{i=1}^\infty p_i(x_i) \, dx_i$$

and

$$Q = \prod_{i=1}^\infty q_i(x_i) \, dx_i.$$

Measures which have densities with respect to a product measure will be called
quasi-product measures.
Let

$$T(x) = (T_1(x_1), \ldots, T_n(x_n), \ldots)$$

be the infinite-dimensional diagonal transportation mapping, where $T_i(x_i)$ transforms
$p_i(x_i) \, dx_i$ onto $q_i(x_i) \, dx_i$. Clearly, $T$ takes $P$ onto $Q$. The inverse mapping
$S = T^{-1}$ has the same structure:

$$S(x) = (S_1(x_1), \ldots, S_n(x_n), \ldots).$$

Theorem 4.1. Let $\mu = f \cdot P$ and $\nu = g \cdot Q$ be probability measures satisfying
Assumption (A) of the previous section. Assume, in addition, that

1) there exists $K > 0$ such that every $q_i$ is $K$-uniformly log-concave;
2) there exists $M > 0$ such that

$$S_i'(x_i) \le M$$

uniformly in $i$ and $x$;

3) there exist $c$, $C$ such that $0 < c \le g \le C$;
4) $f \log f \in L^1(P)$.

Then there exists an optimal transportation mapping $T$ transforming $\mu$ onto $\nu$.

Remark 4.2. It follows from Caffarelli's contraction theorem (see [14]) that
assumption 2) is satisfied if $(-\log p_i(x_i))'' \ge C_0$, $(-\log q_i(x_i))'' \le C_1$ for some $C_0, C_1 > 0$
and every $i$. Of course, there exist many other examples when this assumption is
satisfied.

Proof. Consider the finite-dimensional projections $\mu_n = f_n \cdot P_n$, $\nu_n = g_n \cdot Q_n$, where
$P_n = \prod_{i=1}^n p_i(x_i) \, dx_i$, $Q_n = \prod_{i=1}^n q_i(x_i) \, dx_i$. Here $f_n$ and $g_n$ are the conditional
expectations of $f$, $g$ with respect to $P$, $Q$ and the $\sigma$-algebra $\mathcal{F}_n$. Recall that $\nabla\varphi_n$
is the optimal transportation of $\mu_n$ to $\nu_n$. Let $u_i(x_i)$, $v_i(y_i) = u_i^*$ be the
one-dimensional potentials associated to the mappings $T_i$, $S_i$, respectively: $T_i = u_i'$,
$S_i = v_i'$. Note that $T_n = (T_1, \ldots, T_n)$ maps $P_n$ onto $Q_n$ and $\nabla\varphi_n$ maps
$\frac{f_n}{g_n(\nabla\varphi_n)} \cdot P_n$ onto $Q_n$.

According to Proposition 2.2 one has the following estimate:

$$\frac{K}{2} \int |T_n - \nabla\varphi_n|^2 \, \frac{f_n}{g_n(\nabla\varphi_n)} \, dP_n \le \int \log\Bigl( \frac{f_n}{g_n(\nabla\varphi_n)} \Bigr) \frac{f_n}{g_n(\nabla\varphi_n)} \, dP_n. \qquad (3)$$

Applying uniform bounds on $g$ one easily gets that $c \le g_n \le C$ uniformly in $n$. It
follows from the Jensen inequality that

$$\int f_n \log f_n \, dP \le \int f \log f \, dP. \qquad (4)$$

Clearly, (3), (4) imply that

$$\sup_n \int |T_n - \nabla\varphi_n|^2 f_n \, dP_n = \sup_n \int |T_n - \nabla\varphi_n|^2 \, d\mu_n < \infty.$$

We complete the proof by applying Proposition 3.4 to the sequences of functions
$\sum_{i=1}^n u_i(x_i)$, $\sum_{i=1}^n v_i(y_i)$. We need to estimate $\sum_{i=1}^n \int (u_i(x_i) + v_i(y_i) - x_i y_i) \, d\pi_n$.
Taking into account that $\pi_n$ is supported on the graph of $\nabla\varphi_n$, and the relation
$u_i(x_i) + v_i(T_i(x)) = x_i T_i(x)$, we obtain that the latter equals

$$\sum_{i=1}^n \int \bigl( u_i(x_i) + v_i(\partial_{x_i} \varphi_n) - x_i \partial_{x_i} \varphi_n(x) \bigr) \, d\mu_n$$

$$= \sum_{i=1}^n \int \bigl[ v_i(\partial_{x_i} \varphi_n(x)) - v_i(T_i(x)) - x_i (\partial_{x_i} \varphi_n(x) - T_i(x)) \bigr] \, d\mu_n$$

$$= \sum_{i=1}^n \int \bigl[ v_i(\partial_{x_i} \varphi_n(x)) - v_i(T_i(x)) - v_i'(T_i(x)) (\partial_{x_i} \varphi_n(x) - T_i(x)) \bigr] \, d\mu_n$$

$$\le C \sum_{i=1}^n \int (\partial_{x_i} \varphi_n(x) - T_i)^2 \, d\mu_n.$$

Here we use the uniform bound $v_i'' = S_i' \le M$. Finally, we obtain that

$$\sum_{i=1}^n \int (u_i(x_i) + v_i(y_i) - x_i y_i) \, d\pi_n \le M \int |\nabla\varphi_n - T_n|^2 \, d\mu_n.$$

We have already shown that the right-hand side is uniformly bounded in $n$. The
result follows from Proposition 3.5. □



5. Kantorovich duality in the class of measures with compact invariance group

In this section we start to study measures which are invariant under actions of
some group.

We begin with the most favorable situation: compact spaces and groups. The
result of this section will not be used in this paper, but it is of independent interest.

Let $X$, $Y$ be compact metric spaces, $G$ be a compact group with bijective continuous
actions $L^X$ and $L^Y$ on $X$, $Y$ respectively. The action $L$ on the product space
$X \times Y$ is defined as follows:

$$L_g(X \times Y) = L_g^X(X) \times L_g^Y(Y).$$

Let $\mu$ and $\nu$ be Borel probability measures which are invariant under the actions
$L^X$ and $L^Y$ respectively: $\mu, \nu \in I_{inv} \iff \mu \circ (L_g^X)^{-1} = \mu$, $\nu \circ (L_g^Y)^{-1} = \nu$. We fix
a non-negative and lower-semicontinuous cost function $c\colon X \times Y \to \mathbb{R}$. Denote by $\Pi$
the set of all non-negative Borel probability measures on $X \times Y$ with the property
$\pi \in \Pi \iff \pi \circ (\mathrm{Pr}_X)^{-1} = \mu$ and $\pi \circ (\mathrm{Pr}_Y)^{-1} = \nu$.

The space of $L_g$-invariant continuous functions $V_L$ is a vector subspace of the
space $C_b$, so there exists a factor space $W_L = C_b / V_L$ of functions with the property
$\int_{g \in G} (u \circ L_g^{-1}) \, d\chi(g) = 0$, where $d\chi(g)$ is the normalized Haar measure on $G$. It is not
hard to verify that $C_b = V_L \oplus W_L$, and we can uniquely decompose every function
$u$ from $C_b(X \times Y)$ into the sum of an $L_g$-invariant function $\bar{u}$ from $V_L$ and a function
$\hat{u}$ from $W_L$:

$$u = \bar{u} + \hat{u}.$$

It is clear that $\bar{u} = \int_{g \in G} (u \circ L_g^{-1}) \, d\chi(g) = \mathrm{Pr}_{V_L}(u)$ and it is a continuous
projection of $u$ onto $V_L$.
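The decomposition into an invariant part and a part with zero group average can be tried out on a finite model. The sketch below rests on assumptions that are not in the text: $X = Y = \mathbb{Z}_n$, the group $G = \mathbb{Z}_n$ acts by simultaneous cyclic shifts $L_g(x, y) = (x + g, y + g)$, the Haar measure is the uniform measure, and `shift_average` is a hypothetical name. Exact rational arithmetic is used so that the identities can be checked without rounding.

```python
from fractions import Fraction

def shift_average(u):
    """Invariant part of u: the average of u o L_g^{-1} over g in Z_n (uniform Haar measure)."""
    n = len(u)
    return [
        [sum(u[(x - g) % n][(y - g) % n] for g in range(n)) / Fraction(n) for y in range(n)]
        for x in range(n)
    ]

n = 3
# Assumed test function on Z_3 x Z_3.
u = [[Fraction(x * x + 2 * y) for y in range(n)] for x in range(n)]

u_bar = shift_average(u)                       # component in V_L
u_hat = [[u[x][y] - u_bar[x][y] for y in range(n)] for x in range(n)]  # component in W_L

print(u_bar)
print(u_hat)
```

Averaging is idempotent and annihilates the second component, which is the finite analog of $C_b = V_L \oplus W_L$.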

In the theorem below we generalize the well-known Kantorovich duality for the 
case of G-invariant constraints. 

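Theorem 5.1. In the setting described above the following identity holds:

$$\inf_{\pi \in \Pi \cap I_{inv}} \int_{X \times Y} c(x, y) \, d\pi = \sup_{\varphi + \psi \le \bar{c}} \Bigl( \int_X \varphi(x) \, d\mu + \int_Y \psi(y) \, d\nu \Bigr).$$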

Proof. The proof is based on the Fenchel-Rockafellar duality theorem. 

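Theorem 5.2. (Fenchel-Rockafellar duality) Let $X$ be a Polish space and let $\Theta$ and
$\Omega$ be convex functionals from $C_b(X)$ to $\mathbb{R} \cup \{+\infty\}$. Assume that $\Theta$ is continuous
at some point. Let $\Theta^*$ and $\Omega^*$ be the Legendre-Fenchel transforms of $\Theta$ and $\Omega$
considered on the space of Radon measures $M := (C_b)^*$ on $X$ and defined as follows:

$$\Theta^*(\pi) = \sup_{u \in C_b} \Bigl( \int_X u \, d\pi - \Theta(u) \Bigr), \qquad \Omega^*(\pi) = \sup_{u \in C_b} \Bigl( \int_X u \, d\pi - \Omega(u) \Bigr).$$

Then the following Kantorovich-type duality holds:

$$\inf_{u \in C_b} \bigl( \Theta(u) + \Omega(u) \bigr) = \sup_{\pi \in M} \bigl( -\Theta^*(-\pi) - \Omega^*(\pi) \bigr). \qquad (5)$$

Let

$$\Theta(u) = \begin{cases} 0 & \text{if } u(x, y) \ge -c(x, y), \\ +\infty & \text{else,} \end{cases}$$

$$\Omega(u) = \begin{cases} \int_X \varphi(x) \, d\mu + \int_Y \psi(y) \, d\nu & \text{if } u(x, y) = \varphi(x) + \psi(y) + \hat{\omega}(x, y), \\ +\infty & \text{else,} \end{cases}$$

where $\hat{\omega}(x, y)$ is a function from $W_L(X \times Y)$. It can be checked that they are
both convex and $\Theta$ is continuous at $u = 0$. Let us find their Legendre-Fenchel
transforms:

$$\Theta^*(-\pi) = \sup_{u \in C_b(X \times Y)} \Bigl\{ -\int_{X \times Y} u(x, y) \, d\pi : \ u(x, y) \ge -c(x, y) \Bigr\}.$$

If the measure $\pi$ is not nonnegative, then there exists a non-positive function
$v \in C_b(X \times Y)$ such that $\int v \, d\pi > 0$. Then we can choose $u = -\lambda v$, $\lambda \to +\infty$, and see
that the supremum of our functional is $+\infty$. In the other case, when $\pi$ is nonnegative,
it is clear that the supremum is $\int c \, d\pi$. So:

$$\Theta^*(-\pi) = \begin{cases} \int_{X \times Y} c(x, y) \, d\pi & \text{if } \pi \in M_+(X \times Y), \\ +\infty & \text{else.} \end{cases}$$

Next,

$$\Omega^*(\pi) = \sup_{u \in C_b(X \times Y)} \Bigl\{ \int_{X \times Y} u(x, y) \, d\pi - \int_X \varphi(x) \, d\mu - \int_Y \psi(y) \, d\nu : \ u(x, y) = \varphi(x) + \psi(y) + \hat{\omega}(x, y) \Bigr\}$$

$$= \sup_{\varphi, \psi, \hat{\omega}} \Bigl( \int_{X \times Y} (\varphi + \psi) \, d\pi + \int_{X \times Y} \hat{\omega}(x, y) \, d\pi - \int_X \varphi(x) \, d\mu - \int_Y \psi(y) \, d\nu \Bigr).$$

If $\pi \notin \Pi$, then there exist $\varphi_1, \psi_1 \in C_b(X \times Y)$, $\hat{\omega}_1 \in W_L(X \times Y)$ such that
$\int_{X \times Y} (\psi_1 + \varphi_1) \, d\pi - \int_X \varphi_1(x) \, d\mu - \int_Y \psi_1(y) \, d\nu > 0$ and $\int_{X \times Y} \hat{\omega}_1 \, d\pi \ge 0$. Then the
choice $\varphi = \lambda \varphi_1$, $\psi = \lambda \psi_1$, $\lambda \to \infty$, shows that the supremum in $\Omega^*$ is $+\infty$. Similarly,
if $\pi \in \Pi$, but $\pi \notin I_{inv}$, then there exists $\hat{\omega}_1 \in W_L(X \times Y)$ such that $\int_{X \times Y} \hat{\omega}_1 \, d\pi > 0$
and the other terms vanish. Again the choice $\hat{\omega} = \lambda \hat{\omega}_1$, $\lambda \to \infty$, shows that the
supremum is $+\infty$. Obviously, in the last case $\pi \in \Pi \cap I_{inv}$ the supremum in $\Omega^*$
is $0$. Thus:

$$\Omega^*(\pi) = \begin{cases} 0 & \text{if } \pi \in \Pi \cap I_{inv}, \\ +\infty & \text{else.} \end{cases}$$

Calculate the right-hand side of (5):

$$\sup_{\pi \in M} \bigl( -\Theta^*(-\pi) - \Omega^*(\pi) \bigr) = \sup_{\pi \in \Pi \cap I_{inv}} \int_{X \times Y} -c(x, y) \, d\pi = -\inf_{\pi \in \Pi \cap I_{inv}} \int_{X \times Y} c(x, y) \, d\pi.$$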
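In the left-hand side of the duality statement we have

$$\inf_{u \in C_b} \bigl( \Theta(u) + \Omega(u) \bigr) = \inf_{\varphi, \psi, \hat{\omega}} \Bigl\{ \int_X \varphi(x) \, d\mu + \int_Y \psi(y) \, d\nu : \ \varphi(x) + \psi(y) + \hat{\omega}(x, y) \ge -c(x, y) \Bigr\}$$

$$= -\sup_{\varphi, \psi, \hat{\omega}} \Bigl\{ \int_X \varphi(x) \, d\mu + \int_Y \psi(y) \, d\nu : \ \varphi(x) + \psi(y) + \hat{\omega}(x, y) \le c(x, y) \Bigr\}.$$

We are going to use the fact that $\mu$ and $\nu$ are invariant measures under the actions
$L^X$ and $L^Y$ of the group $G$. This implies that $\int_X \varphi(x) \, d\mu + \int_Y \psi(y) \, d\nu = \int_X \bar{\varphi}(x) \, d\mu + \int_Y \bar{\psi}(y) \, d\nu$.
So, we get

$$-\sup_{\varphi, \psi, \hat{\omega}} \Bigl\{ \int_X \bar{\varphi}(x) \, d\mu + \int_Y \bar{\psi}(y) \, d\nu : \ \bar{\varphi}(x) + \bar{\psi}(y) + \hat{\varphi} + \hat{\psi} + \hat{\omega} \le \bar{c} + \hat{c} \Bigr\}$$

$$= -\sup_{\bar{\varphi}, \bar{\psi}, \hat{\varphi}, \hat{\psi}, \hat{\omega}} \Bigl\{ \int_X \bar{\varphi}(x) \, d\mu + \int_Y \bar{\psi}(y) \, d\nu : \ \bar{\varphi}(x) + \bar{\psi}(y) \le \bar{c} + (\hat{c} - \hat{\omega} - \hat{\varphi} - \hat{\psi}) \Bigr\}.$$

Note that the maximized functional does not depend on the $W_L$-parts of $\varphi$ and $\psi$,
thus we can choose $\hat{\varphi}$ and $\hat{\psi}$ arbitrarily. Hence $\tilde{c} := \hat{c} - \hat{\omega} - \hat{\varphi} - \hat{\psi}$ is just an arbitrary
function from $W_L$. The inequality $\bar{\varphi}(x) + \bar{\psi}(y) \le \bar{c} + \tilde{c}$ holds pointwise, so we can
act on it by any element $g \in G$:

$$(\bar{\varphi} \circ L_g)(x) + (\bar{\psi} \circ L_g)(y) \le (\bar{c} \circ L_g + \tilde{c} \circ L_g)(x, y) \iff \bar{\varphi}(x) + \bar{\psi}(y) \le (\bar{c} + \tilde{c} \circ L_g)(x, y)$$

for any $(x, y) \in X \times Y$. So:

$$\bar{\varphi}(x) + \bar{\psi}(y) \le (\bar{c} + \tilde{c})(x, y) \iff \bar{\varphi}(x) + \bar{\psi}(y) \le \bigl( \bar{c} + \inf_{g \in G} (\tilde{c} \circ L_g) \bigr)(x, y).$$

It follows immediately from the definition of $W_L$ that $\inf_{g \in G} (\tilde{c} \circ L_g) \le 0$. Hence
the supremum is reached at $\tilde{c} = 0$ ($\hat{\omega} = \hat{c}$, $\varphi = \bar{\varphi}$, $\psi = \bar{\psi}$). Finally, we get:

$$-\sup_{\bar{\varphi}, \bar{\psi} \in V_L,\ \tilde{c} \in W_L} \Bigl\{ \int_X \bar{\varphi}(x) \, d\mu + \int_Y \bar{\psi}(y) \, d\nu : \ \bar{\varphi}(x) + \bar{\psi}(y) \le \bar{c} + \tilde{c} \Bigr\}
= -\sup_{\bar{\varphi} + \bar{\psi} \le \bar{c}} \Bigl( \int_X \bar{\varphi}(x) \, d\mu + \int_Y \bar{\psi}(y) \, d\nu \Bigr).$$

Collecting everything together we get from (5) the required statement. □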

6. Exchangeable measures 

In this section we discuss the mass transportation of exchangeable measures. Recall that a probability measure is exchangeable if it is invariant with respect to any permutation of a finite number of coordinates. Before we consider $\mathbb{R}^\infty$, let us make some remarks on the finite-dimensional case.

Let $S_d$ be the group of all permutations of $\{1, \dots, d\}$. This group acts on $\mathbb{R}^d$ as follows:

$$L_\sigma(x) = (x_{\sigma(1)}, x_{\sigma(2)}, \dots, x_{\sigma(d)}), \quad \sigma \in S_d.$$

Let $\Gamma \subset S_d$ be any subgroup which acts transitively. The latter means that for every couple $i, j$ there exists $\sigma \in \Gamma$ such that $\sigma(i) = j$.

Assume that the source and image measures are both invariant with respect to $\Gamma$. In this case the Kantorovich potential $\varphi$ is also $\Gamma$-invariant: $\varphi = \varphi \circ L_\sigma$ for any $\sigma \in \Gamma$. Consequently, the optimal transportation $T = \nabla\varphi$ has the following property:

$$T = L_\sigma^* (T \circ L_\sigma) = L_\sigma^{-1} \circ T \circ L_\sigma.$$

Equivalently,

$$L_\sigma \circ T = T \circ L_\sigma.$$
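
As a small numerical illustration of this commutation relation (not part of the original argument), the following Python sketch takes a hypothetical symmetric convex potential $\varphi(x) = \sum_i x_i^4/4 + (\sum_i x_i)^2/2$ on $\mathbb{R}^3$, chosen only for this example, and checks that $T = \nabla\varphi$ commutes with every coordinate permutation. The function names `act` and `grad_phi` are illustrative and do not come from the paper.

```python
from itertools import permutations
from math import isclose


def act(sigma, x):
    """L_sigma(x) = (x_sigma(1), ..., x_sigma(d)), written with 0-based indices."""
    return tuple(x[s] for s in sigma)


def grad_phi(x):
    """Gradient of the assumed symmetric convex potential
    phi(x) = sum_i x_i**4 / 4 + (sum_i x_i)**2 / 2."""
    total = sum(x)
    return tuple(xi ** 3 + total for xi in x)


def commutes_with_all_permutations(x):
    """Check L_sigma(T(x)) == T(L_sigma(x)) for T = grad_phi and all sigma in S_d."""
    for sigma in permutations(range(len(x))):
        lhs = act(sigma, grad_phi(x))
        rhs = grad_phi(act(sigma, x))
        if not all(isclose(a, b, abs_tol=1e-12) for a, b in zip(lhs, rhs)):
            return False
    return True


print(commutes_with_all_permutations((0.5, -1.0, 2.0)))
```

Here the full group $S_3$ plays the role of $\Gamma$; any transitive subgroup could be substituted by restricting the loop.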

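The action of $\Gamma$ can be extended to $\mathbb{R}^d \times \mathbb{R}^d$ as follows: $L_\sigma(x, y) = (L_\sigma x, L_\sigma y)$. The optimal transportation plan $\pi(dx, dy)$ is also $\Gamma$-invariant.
Now let $\sigma(i) = j$. One has

$$\int x_i T_i \, d\mu = \int \langle e_i, x \rangle \langle e_i, T(x) \rangle \, d\mu = \int \langle L_\sigma e_i, L_\sigma x \rangle \langle e_i, L_\sigma^* (T(L_\sigma x)) \rangle \, d\mu$$

$$= \int \langle e_j, L_\sigma x \rangle \langle e_j, T(L_\sigma x) \rangle \, d\mu = \int \langle e_j, x \rangle \langle e_j, T(x) \rangle \, d\mu = \int x_j T_j \, d\mu.$$

Consequently,

$$W_2^2(\mu, \nu) = \int \|x - T(x)\|^2 \, d\mu = \sum_{i=1}^{d} \int (x_i - T_i(x))^2 \, d\mu = d \int (x_i - T_i(x))^2 \, d\mu, \quad \forall i.$$

Conclusion: The quadratic Monge-Kantorovich problem for $\Gamma$-invariant marginals is equivalent to the transportation problem for the cost $|x_1 - y_1|^2$ restricted to the set of $\Gamma$-invariant probability measures.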

We denote by $S_\infty$ the group of permutations of $\mathbb{N}$ which permute only a finite number of coordinates. We consider its natural action on $\mathbb{R}^\infty$ defined by

$$\sigma(x) = (x_{\sigma(i)}), \quad x \in \mathbb{R}^\infty, \ \sigma \in S_\infty.$$

In this section we consider measures $\mu$ and $\nu$ which are invariant with respect to any $\sigma \in S_\infty$:

$$\mu = \mu \circ \sigma^{-1}, \quad \nu = \nu \circ \sigma^{-1}.$$

The measures of this type are called exchangeable.

Example 6.1. Let $m$ be a Borel probability measure on $\mathbb{R}$. Its countable power $m^\infty$ is an exchangeable measure on $\mathbb{R}^\infty$.

The conclusion made above helps us to give a variational meaning to the trans- 
portation problem in the infinite-dimensional case. 

Definition 6.2. Let $\mu$ and $\nu$ be exchangeable. Consider the set $\mathcal{P}_{S_\infty}$ of probability measures on $X \times Y$, $X = Y = \mathbb{R}^\infty$, which are invariant with respect to any mapping $(x, y) \to (L_\sigma(x), L_\sigma(y))$, $\sigma \in S_\infty$, and have fixed $S_\infty$-invariant marginals $\mu$ and $\nu$. We say that a measure $\pi \in \mathcal{P}_{S_\infty}$ is a solution to the quadratic Monge-Kantorovich problem if it gives the minimum to the functional

$$\mathcal{P}_{S_\infty} \ni m \mapsto \int (x_1 - y_1)^2 \, dm. \tag{6}$$

Assume that there exists a measurable mapping $T : X \to Y$ such that $\pi(\{(x, T(x))\}) = 1$. Then we say that $T$ is an optimal transportation of $\mu$ onto $\nu$.

Clearly, a solution to the Monge-Kantorovich problem (6) exists provided $\int x_1^2 \, d\mu < \infty$, $\int y_1^2 \, d\nu < \infty$. The corresponding optimal mapping $T$ (if it exists) must commute with any $L_\sigma$. This means that for $\mu$-almost all $x$

$$T \circ L_\sigma(x) = L_\sigma \circ T(x). \tag{7}$$

The set of exchangeable measures is described in the following generalization of the classical de Finetti theorem (see, e.g., [3], Theorem 10.10.19).

Theorem 6.3. Let $\mathcal{P}$ be the space of Borel probability measures on $\mathbb{R}$ equipped with the weak topology. Then for every Borel exchangeable $\mu$ on $\mathbb{R}^\infty$ there exists a probability measure $\Pi$ on $\mathcal{P}$ such that

$$\mu(B) = \int m^\infty(B) \, \Pi(dm)$$

for every Borel $B \subset \mathbb{R}^\infty$.
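
The mixture representation can be made concrete with a simulation. The sketch below is only an illustration under assumed choices: the mixing measure $\Pi$ is taken to put mass 1/2 on each of $N(-1,1)$ and $N(1,1)$, and the helper names are hypothetical. It samples from the resulting exchangeable measure and estimates the covariance of two coordinates, which is the same for every pair (exchangeability) but nonzero (the coordinates are not independent).

```python
import random


def sample_exchangeable(n_coords, rng):
    """Draw one point of R^n from a two-component mixture of product measures:
    pick a mean (-1 or +1) with probability 1/2 each, then draw
    i.i.d. N(mean, 1) coordinates."""
    mean = rng.choice([-1.0, 1.0])
    return [rng.gauss(mean, 1.0) for _ in range(n_coords)]


def empirical_covariance(samples, i, j):
    """Plain empirical covariance of coordinates i and j."""
    n = len(samples)
    mean_i = sum(s[i] for s in samples) / n
    mean_j = sum(s[j] for s in samples) / n
    return sum((s[i] - mean_i) * (s[j] - mean_j) for s in samples) / n


rng = random.Random(0)
samples = [sample_exchangeable(3, rng) for _ in range(20000)]
cov_01 = empirical_covariance(samples, 0, 1)
cov_12 = empirical_covariance(samples, 1, 2)
print(cov_01, cov_12)  # both should be close to the variance of the random mean, i.e. 1
```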

The structure of mappings satisfying (7) is easy to describe. Assume first that $\mu$ is a product measure. Consider the function $T_1(x) = \langle T(x), e_1 \rangle$ and fix the first coordinate $x_1$. Then the function $F : (x_2, x_3, \dots) \to T_1(x)$ is invariant with respect to $S_\infty$. Hence $F$ is constant by the Hewitt-Savage 0-1 law. Thus $T_1(x) = T_1(x_1)$ depends on $x_1$ only and the mapping $T$ is diagonal: $T(x) = (T_1(x_1), T_2(x_2), \dots)$.

Example 6.4. Let $\mu_1$, $\mu_2$ be countable powers of two different one-dimensional measures. By the Kakutani dichotomy theorem they are mutually singular. There is no mass transportation $T$ of $\mu = \mu_1$ onto $\nu = \frac{1}{2}(\mu_1 + \mu_2)$ satisfying (7). Indeed, any $T$ satisfying (7) must be diagonal, hence the measure $\mu \circ T^{-1}$ must be a product measure.

Thus, we see that the optimal transportation does not always exist. Let us find sufficient conditions for the existence. We consider a couple of exchangeable measures $\mu$, $\nu$ and their mixture representations:

$$\mu = \int m^\infty \, d\Pi_\mu, \quad \nu = \int m^\infty \, d\Pi_\nu.$$

By the strong law of large numbers, for any $m$-integrable Borel function $f$ one has for $m^\infty$-almost any $x$

$$\int f \, dm = \lim_n \frac{1}{n} \bigl( f(x_1) + \dots + f(x_n) \bigr).$$

Let us choose a sequence of bounded continuous functions $\{f_i\}$ on $\mathbb{R}$ which is dense in $C([a, b])$ for any $a, b$ and set

$$S_m = \bigcap_{i=1}^{\infty} \Bigl\{ x : \lim_n \frac{1}{n} \bigl( f_i(x_1) + \dots + f_i(x_n) \bigr) = \int f_i \, dm \Bigr\}.$$

Clearly, $m^\infty(S_m) = 1$, but $p^\infty(S_m) = 0$ for $p \ne m$.
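
The role of the sets $S_m$ can be seen numerically: empirical averages along an i.i.d. sequence recover the integral against the underlying one-dimensional measure, so sequences sampled from different product measures end up in different sets. The sketch below assumes, purely for illustration, $m = U(0,1)$, $p = U(0,2)$ and a single bounded continuous test function; none of these choices come from the paper.

```python
import random


def empirical_average(f, xs):
    """(1/n) * (f(x_1) + ... + f(x_n))."""
    return sum(f(x) for x in xs) / len(xs)


def clipped_identity(t):
    """A bounded continuous test function: t clipped to [-5, 5]."""
    return min(max(t, -5.0), 5.0)


rng = random.Random(1)
n = 20000
x_from_m = [rng.uniform(0.0, 1.0) for _ in range(n)]  # i.i.d. sample from m = U(0, 1)
x_from_p = [rng.uniform(0.0, 2.0) for _ in range(n)]  # i.i.d. sample from p = U(0, 2)

avg_m = empirical_average(clipped_identity, x_from_m)  # approximates int f dm = 0.5
avg_p = empirical_average(clipped_identity, x_from_p)  # approximates int f dp = 1.0
print(avg_m, avg_p)
```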
For any couple of measures $m_1^\infty$, $m_2^\infty$ we set

$$d^2(m_1^\infty, m_2^\infty) := W_2^2(m_1, m_2).$$

Recall that $W_2^2(m_1, m_2)$ is the squared Kantorovich distance between $m_1$, $m_2$.
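
For intuition about the one-dimensional quantity $W_2^2(m_1, m_2)$, recall the standard fact that on the real line the monotone rearrangement is optimal for the quadratic cost. For two empirical measures with the same number of equally weighted atoms this reduces to matching sorted samples. The following minimal sketch (the function name `w2_squared_1d` is illustrative) computes the squared distance in that special case only.

```python
def w2_squared_1d(xs, ys):
    """Squared Kantorovich (W_2) distance between two empirical measures on R
    with the same number of equally weighted atoms, using the fact that the
    sorted (monotone) matching is optimal for the quadratic cost in 1D."""
    if len(xs) != len(ys):
        raise ValueError("both samples must have the same size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum((a - b) ** 2 for a, b in pairs) / len(xs)


# Shifting every atom by 1 costs exactly 1 per unit of mass.
print(w2_squared_1d([0.0, 1.0], [1.0, 2.0]))  # 1.0
```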

Let $T$ be a transportation mapping which 1) transforms $\mu$ onto $\nu$, 2) commutes with $S_\infty$, 3) gives minimum to the functional $\int |T_1(x) - x_1|^2 \, d\mu$ among the mappings satisfying 1) and 2). It follows from the considerations above that $T$ must be diagonal on any set $S_m$ (up to a set of $m^\infty$-measure zero and for $\Pi_\mu$-almost all $m$). This means that for $\Pi_\mu$-almost all $m$ one has

$$T|_{S_m} = T_{m, F(m)} \quad m^\infty\text{-a.e.},$$

where $F : \mathcal{P}(\mathbb{R}) \to \mathcal{P}(\mathbb{R})$ is a Borel mapping and $T_{m, F(m)}$ is the diagonal optimal transportation of $m^\infty$ onto $F^\infty(m)$. Computing the transportation cost

$$\int |T_1 - x_1|^2 \, d\mu = \int W_2^2(m, F(m)) \, d\Pi_\mu = \int d^2(m^\infty, F^\infty(m)) \, d\Pi_\mu$$

we get that $F$ must be an optimal transportation of $\Pi_\mu$ onto $\Pi_\nu$ for the cost function $(m, p) \to W_2^2(m, p)$.

Taking into account all the remarks above, we arrive at the following conclusion. 

Proposition 6.5. Let the mapping $T$ satisfy the following assumptions

1) $\nu = \mu \circ T^{-1}$,

2) $L_\sigma \circ T = T \circ L_\sigma$ $\mu$-a.e.,

3) $\int |T_1(x) - x_1|^2 \, d\mu$ is minimal among all mappings satisfying 1) and 2).
Then up to a set of $\mu$-measure zero $T|_{S_m} = T_{m, F(m)}$, where $F$ is an optimal transportation of $\Pi_\mu$ onto $\Pi_\nu$ for the cost function $(m, p) \to W_2^2(m, p)$ and $T_{m, F(m)}$ is the diagonal optimal transportation of $m^\infty$ onto $F^\infty(m)$.

Remark 6.6. The mapping F can be considered as a kind of "factorization" of T. 
Of course, as we have already seen, T and F do not always exist. 

Proposition 6.5 does not give, however, any checkable sufficient conditions for the existence of an optimal transportation. The question whether the mapping $T$ from Proposition 6.5 can be approximated by finite-dimensional optimal mappings seems to be non-trivial. In the rest of the section we give some constructive sufficient conditions for the existence of the optimal transportation of exchangeable measures, where the transportation is understood (as before) as a limit of finite-dimensional approximations (see Definition 1.6).

Recall that the projection $\mu \circ P_n^{-1}$ of $\mu$ onto the first $n$ coordinates is denoted by $\mu_n$. For a couple of numbers $m < n$ we denote by $P_{m,n}$ the projection onto the subspace generated by $\{e_{m+1}, \dots, e_n\}$ and by $\mu_{m,n} = \mu \circ P_{m,n}^{-1}$ the corresponding image of $\mu$.

It will be assumed in the rest of this section that for any couple of numbers $n > m$ the projection $\mu_n$ is absolutely continuous with respect to $\mu_m \times \mu_{m,n}$, hence there exists a representation

$$\mu_n = \rho_{m,n} \cdot \mu_m \times \mu_{m,n} \tag{8}$$

and the same holds for $\nu$:

$$\nu_n = d_{m,n} \cdot \nu_m \times \nu_{m,n}. \tag{9}$$

Theorem 6.7. Assume that

1) the measures $\mu$ and $\nu$ are exchangeable;

2) all the projections $\mu_n$, $\nu_n$ admit Lebesgue densities and the representations (8), (9);

3) there exist a sequence $C_m > 0$ and $\varepsilon > 0$ such that for every $m, n \in \mathbb{N}$ one has

$$\int \log \rho_{m,n} \, d\mu < C_m, \quad \int d_{m,n}^{-(1+\varepsilon)} \, d\nu < C_m$$

and

$$\lim_{m \to \infty} \frac{C_m}{m} = 0;$$

4) the measure $\nu$ is $K$-uniformly log-concave for some $K > 0$.
Then there exists an optimal transportation mapping $T$ transforming $\mu$ onto $\nu$.

Remark 6.8. The assumption 3) means, in particular, that the Kullback-Leibler distance between $\mu_n$ and $\mu_m \times \mu_{m,n}$ is $o(m)$ uniformly in $n$.

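Proof. Let $T_m$ be the optimal transportation mapping transforming $\mu_m$ onto $\nu_m$ and $T_{m,n}$ be the optimal transportation mapping transforming $\mu_{m,n}$ onto $\nu_{m,n}$. Clearly, the product mapping

$$\tilde T_{m,n} = (T_m, T_{m,n})$$

maps $\mu_m \times \mu_{m,n}$ onto $\nu_m \times \nu_{m,n}$. Using the representation $\nu_n = d_{m,n} \cdot \nu_m \times \nu_{m,n}$, we get that $\tilde T_{m,n}$ transforms

$$d_{m,n}(\tilde T_{m,n}) \cdot \mu_m \times \mu_{m,n}$$

onto $\nu_n$.

By Proposition 2.2 one has the following estimate:

$$\mathrm{Ent}_{d_{m,n}(\tilde T_{m,n}) \cdot \mu_m \times \mu_{m,n}} \Bigl( \frac{\rho_{m,n}}{d_{m,n}(\tilde T_{m,n})} \Bigr) \ge \frac{K}{2} \int \|\tilde T_{m,n} - T_n\|^2 \, d\mu_n.$$

This implies that

$$\int \log \Bigl( \frac{\rho_{m,n}}{d_{m,n}(\tilde T_{m,n})} \Bigr) d\mu_n \ge \frac{K}{2} \int \|T_m - P_m \circ T_n\|^2 \, d\mu_n = \frac{K}{2} \int \|T_m - P_m \circ T_n\|^2 \, d\mu.$$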

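Let us estimate the left-hand side:

$$\int \log \Bigl( \frac{\rho_{m,n}}{d_{m,n}(\tilde T_{m,n})} \Bigr) d\mu_n = \int \log \rho_{m,n} \, d\mu - \int \log d_{m,n}(\tilde T_{m,n}) \, d\mu_n.$$

The desired estimate for the first term follows immediately from the assumption that $\int \log \rho_{m,n} \, d\mu < C_m$. Let us apply the inequality

$$xy \le \frac{1}{\varepsilon} \bigl( e^{\varepsilon x} + y \log y - y \bigr),$$

which holds true for any $x$ and any $y > 0$, $\varepsilon > 0$. One has

$$-\int \log d_{m,n}(\tilde T_{m,n}) \, d\mu_n = -\int \log d_{m,n}(\tilde T_{m,n}) \, \rho_{m,n} \, d(\mu_m \times \mu_{m,n})$$

$$\le \frac{1}{\varepsilon} \Bigl( \int \frac{1}{d_{m,n}^{\varepsilon}(\tilde T_{m,n})} \, d(\mu_m \times \mu_{m,n}) + \int (\rho_{m,n} \log \rho_{m,n} - \rho_{m,n}) \, d(\mu_m \times \mu_{m,n}) \Bigr)$$

$$= \frac{1}{\varepsilon} \Bigl( \int \frac{1}{d_{m,n}^{\varepsilon}} \, d(\nu_m \times \nu_{m,n}) + \int (\log \rho_{m,n} - 1) \, d\mu_n \Bigr)$$

$$= \frac{1}{\varepsilon} \Bigl( \int \frac{1}{d_{m,n}^{1+\varepsilon}} \, d\nu_n + \int (\log \rho_{m,n} - 1) \, d\mu_n \Bigr).$$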

Applying assumption 3) we finally obtain that there exists a sequence $c_m = o(m)$ such that

$$\frac{K}{2} \int \|T_m - P_m \circ T_n\|^2 \, d\mu_n = \frac{K}{2} \int \|T_m - P_m \circ T_n\|^2 \, d\mu \le c_m.$$

Since $\mu$ and $\nu$ are exchangeable, the same holds for all projections $\mu_n$, $\nu_n$. This implies, in particular, that

$$\int \|T_m - P_m \circ T_n\|^2 \, d\mu = m \int \langle T_m - T_n, e_i \rangle^2 \, d\mu$$

for every $1 \le i \le m$ and $n > m$. Hence $\int \langle T_m - T_n, e_i \rangle^2 \, d\mu \le \frac{2 c_m}{mK}$. Passing to an $L^2(\mu)$-weakly convergent subsequence $\langle T_{n_k}, e_i \rangle \to T_i$ we obtain in the limit

$$\int \bigl( \langle T_m, e_i \rangle - T_i \bigr)^2 \, d\mu \le \frac{2 c_m}{mK}.$$

Clearly, this gives $\lim_m \langle T_m, e_i \rangle = T_i$ in $L^2(\mu)$. It follows from Lemma 1.3 that $T = \sum_{i=1}^{\infty} T_i \cdot e_i$ is the desired mapping. □
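
The Young-type inequality $xy \le \frac{1}{\varepsilon}(e^{\varepsilon x} + y \log y - y)$ used in the proof above can be sanity-checked numerically. The sketch below evaluates the gap between the two sides on an arbitrary grid (the grid and the name `young_gap` are assumptions made for this illustration); equality is attained at $y = e^{\varepsilon x}$, so the minimal gap should be zero up to rounding.

```python
import math


def young_gap(x, y, eps):
    """Right-hand side minus left-hand side of
    x*y <= (exp(eps*x) + y*log(y) - y) / eps, valid for y > 0 and eps > 0."""
    return (math.exp(eps * x) + y * math.log(y) - y) / eps - x * y


grid_x = [k / 4.0 for k in range(-12, 13)]  # x in [-3, 3]
grid_y = [k / 4.0 for k in range(1, 21)]    # y in (0, 5]
min_gap = min(
    young_gap(x, y, eps)
    for eps in (0.1, 0.5, 1.0, 2.0)
    for x in grid_x
    for y in grid_y
)
print(min_gap)  # nonnegative up to floating-point rounding
```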

All assumptions of Theorem 6.7 are easy to check, except 3). Let us give a simple example where it can be checked.

Example 6.9. Let $\mu$ be a finite convex combination of measures which are countable products of measures with Lebesgue densities. Then there exists $C > 0$ such that $\sup_{m,n} \int \log \rho_{m,n} \, d\mu < C$.

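Proof. Let $\mu$ on $\mathbb{R}^\infty$ be a finite convex combination of product measures (finite mixture). This means that it has the form $\mu = \sum_i \lambda_i \mu_i$, $\sum_i \lambda_i = 1$, where every $\mu_i$ is a product measure. The same holds for every projection

$$\mu \circ P_n^{-1} = \sum_i \lambda_i (\mu_i \circ P_n^{-1}).$$

Set:

$$p_i = \frac{d(\mu_i \circ P_m^{-1})}{dx}, \quad q_i = \frac{d(\mu_i \circ P_{m,n}^{-1})}{dx}.$$

Then $\rho_{m,n} = \frac{d\mu_n}{d(\mu_m \times \mu_{m,n})}$ can be expressed as

$$\rho_{m,n} = \frac{\sum_i \lambda_i p_i q_i}{\bigl(\sum_i \lambda_i p_i\bigr)\bigl(\sum_j \lambda_j q_j\bigr)}.$$

We claim that $\int \log \rho_{m,n} \, d\mu < C$ for some constant $C$ independent of $n, m$. Indeed,

$$\rho_{m,n} = \frac{\sum_i \lambda_i p_i q_i}{\bigl(\sum_i \lambda_i p_i\bigr)\bigl(\sum_j \lambda_j q_j\bigr)} = \frac{\sum_i \lambda_i p_i q_i}{\sum_{i,j} \lambda_i \lambda_j p_i q_j} \le \frac{\sum_i \lambda_i p_i q_i}{\sum_i \lambda_i^2 p_i q_i}.$$

Using the trivial estimate

$$\frac{\sum_i \lambda_i p_i q_i}{\sum_i \lambda_i^2 p_i q_i} \le \frac{1}{\inf_i(\lambda_i)}$$

we finally get

$$\int \log \rho_{m,n} \, d\mu = \int \log \frac{\sum_i \lambda_i p_i q_i}{\bigl(\sum_i \lambda_i p_i\bigr)\bigl(\sum_j \lambda_j q_j\bigr)} \, d\mu \le \int \log \frac{1}{\inf_i(\lambda_i)} \, d\mu = -\log \inf_i(\lambda_i) < C.$$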

Let us remark that this argument works well only with finite mixtures. □ 
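
The pointwise bound $\rho_{m,n} \le 1/\inf_i \lambda_i$ from the proof of Example 6.9 can be checked on random inputs. In the sketch below the weights `lams` and the ranges of the density values are arbitrary assumptions for illustration; `rho` evaluates the ratio at a single point given the values $p_i$, $q_i$ of the component densities there.

```python
import random


def rho(lams, p, q):
    """rho = sum_i lam_i p_i q_i / ((sum_i lam_i p_i) * (sum_j lam_j q_j)),
    evaluated at one point from the component density values p_i, q_i."""
    numerator = sum(l * a * b for l, a, b in zip(lams, p, q))
    denominator = sum(l * a for l, a in zip(lams, p)) * sum(l * b for l, b in zip(lams, q))
    return numerator / denominator


rng = random.Random(2)
lams = [0.2, 0.3, 0.5]  # mixture weights, summing to 1
worst = max(
    rho(
        lams,
        [rng.uniform(0.01, 10.0) for _ in lams],
        [rng.uniform(0.01, 10.0) for _ in lams],
    )
    for _ in range(10000)
)
bound = 1.0 / min(lams)
print(worst, bound)  # the observed maximum stays below the bound
```

The bound degenerates as the smallest weight tends to zero, which is consistent with the remark that the argument is limited to finite mixtures.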
7. Stationary Gibbs measures

In this section we study stationary Gibbs measures. Unlike the previous section we identify our space $\mathbb{R}^\infty$ not with $\mathbb{R}^{\mathbb{N}}$, but with $\mathbb{R}^{\mathbb{Z}}$. This means that we allow negative coordinate indices: $x_0, x_{-1}, x_{-2}, \dots$

Set

$$E_n = \mathrm{span}\{e_i, \ -n \le i \le n\}$$

and

$$E_{m,n} = \mathrm{span}\{e_i, e_j, \ -n \le i < -m, \ m < j \le n\}.$$

The corresponding orthogonal projections will be denoted by $P_n$, $P_{m,n}$ accordingly. Let $\sigma_n : E_n \to E_n$ be the cyclical shift

$$\sigma_n : (x_{-n}, x_{-(n-1)}, \dots, x_{-1}, x_0, x_1, \dots, x_n) \mapsto (x_n, x_{-n}, x_{-(n-1)}, \dots, x_{-1}, x_0, \dots, x_{n-1}).$$

In the limit $n \to \infty$ we obtain the standard (Bernoulli) shift

$$\sigma : x = (x_i) \to (x_{i-1}).$$
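
To fix the indexing convention, here is a minimal sketch of the cyclical shift $\sigma_n$ acting on a vector of $E_n$ stored as a Python list $(x_{-n}, \dots, x_n)$. The helper names are illustrative. The last coordinate moves to the front, and applying the shift $2n+1$ times gives the identity.

```python
def cyclic_shift(x):
    """sigma_n on E_n: (x_{-n}, ..., x_n) -> (x_n, x_{-n}, ..., x_{n-1})."""
    return x[-1:] + x[:-1]


def shift_power(x, l):
    """Apply the cyclical shift l times."""
    for _ in range(l):
        x = cyclic_shift(x)
    return x


x = [-2, -1, 0, 1, 2]                  # each entry equals its own index, n = 2
print(cyclic_shift(x))                 # [2, -2, -1, 0, 1]
print(shift_power(x, len(x)) == x)     # True
```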

Definition 7.1. A probability measure $\mu$ is called stationary if it is invariant with respect to $\sigma$: $\mu \circ \sigma^{-1} = \mu$.

Throughout the section the following assumptions hold. 

1) The measure $\mu$ is a weak limit

$$\mu = \lim_n \mu_n,$$

where every $\mu_n$ is a $\sigma_n$-invariant measure on $E_n$ with everywhere positive Lebesgue density and finite second moments;

2) For every $m < n$ there exists a probability measure $\mu_{m,n}$ on $E_{m,n}$ such that the relative entropy (the Kullback-Leibler distance) between $\mu_m \times \mu_{m,n}$ and $\mu_n$ is uniformly bounded in $n$:

$$\int \log \Bigl( \frac{d\mu_n}{d(\mu_m \times \mu_{m,n})} \Bigr) d\mu_n \le C_m$$

with $C_m$ satisfying

$$\lim_m \frac{C_m}{m} = 0;$$

3) For every power $l$ of the cyclical shift $\sigma_m$ the measure $\mu$ has a density with respect to $\mu \circ (\sigma_m^l)^{-1}$. Moreover,

$$e^{u_{m,l}} := \frac{d\mu}{d\mu \circ (\sigma_m^l)^{-1}}$$

satisfies $\sup_{m,l} \|e^{u_{m,l}}\|_{L^{2+\delta}(\mu)} < \infty$ for some $\delta > 0$;

4) Every $\mu_n$ is absolutely continuous with respect to $\mu \circ P_n^{-1}$: $\mu_n = \rho_n \cdot \mu \circ P_n^{-1}$ and, in addition,

$$\sup_n \int \rho_n^{2+\delta} \, d\mu < \infty$$

for some $\delta > 0$.

Remark 7.2. We note that 1) + 4) imply convergence of $\{\mu_n\}$ in a stronger sense. Namely, $\lim_n \int \varphi \, d\mu_n = \int \varphi \, d\mu$ for every cylindrical $\varphi \in L^2(\mu)$. Indeed, take a continuous bounded cylindrical function $\tilde\varphi$ such that $\|\varphi - \tilde\varphi\|_{L^2(\mu)} < \varepsilon$. One has $\lim_n \int \varphi \, d\mu_n = \lim_n \int (\varphi - \tilde\varphi) \, d\mu_n + \int \tilde\varphi \, d\mu$. The claim follows from the estimate

$$\Bigl( \int |\varphi - \tilde\varphi| \, d\mu_n \Bigr)^2 \le \int (\varphi - \tilde\varphi)^2 \, d\mu \cdot \int \rho_n^2 \, d\mu \le \Bigl( \sup_n \int \rho_n^2 \, d\mu \Bigr) \varepsilon^2.$$

Remark 7.3. The idea of the proof of Theorem 7.4 below is the same as in Theorem 6.7. Applying the Talagrand-type inequality from Proposition 2.2 and the symmetric properties of measures we show $L^2$-convergence of the finite-dimensional approximations. The proof of Theorem 7.4 is much longer because we need to overcome the following technical difficulty: the projections of a stationary measure are not invariant with respect to cyclical shifts. Thus the situation is different from the exchangeable case, where we have stability under projections.

Theorem 7.4. Let $\mu$ be a probability measure satisfying Assumptions 1)-4). Then there exists an optimal transportation mapping transforming $\mu$ onto the standard Gaussian measure on $\mathbb{R}^\infty$.

Proof. Let us consider the $n$-dimensional optimal transportation $T_n$ transforming $\mu_n$ onto the standard $n$-dimensional Gaussian measure $\gamma_n$. It follows from the $\sigma_n$-invariance of $\mu_n$ and $\gamma_n$ that the mapping $T_n$ is cyclically invariant:

$$\langle T_n \circ \sigma_n, e_i \rangle = \langle T_n, e_{i-1} \rangle, \quad \mu_n\text{-a.e.}$$

(with the convention $e_{n+1} = e_{-n}$, $e_{-n-1} = e_n$).

Let us fix a couple of numbers $m, n$ with $n > m$. Let $T_{m,n}$ be the optimal transportation mapping transforming $\mu_{m,n}$ onto the standard Gaussian measure on $E_{m,n}$. We stress that $T_m$ and $T_{m,n}$ depend on different collections of coordinates.

We extend $T_m$ onto $E_n$ in the following way:

$$\tilde T_m(x) = T_m(P_m x) + T_{m,n}(P_{m,n} x).$$

Clearly, $\tilde T_m$ maps $\mu_m \times \mu_{m,n}$ onto the standard Gaussian measure on $E_n$. Applying Proposition 2.2 to the couple of mappings $\tilde T_m$, $T_n$, we get

$$\frac{1}{2} \int \|T_n - \tilde T_m\|^2 \, d\mu_n \le \int \log \Bigl( \frac{d\mu_n}{d(\mu_m \times \mu_{m,n})} \Bigr) d\mu_n. \tag{10}$$

Thus

$$\sum_{i=-m}^{m} \int \langle T_n - T_m, e_i \rangle^2 \, d\mu_n \le 2 C_m \tag{11}$$

for every $m, n$, $m < n$.

Let us note that for every $i$ one can extract a weakly convergent subsequence from the sequence of (signed) measures $\{\langle T_n, e_i \rangle \cdot \mu_n\}$. Indeed, for any compact set $K$

$$\Bigl( \int_{K^c} |\langle T_n, e_i \rangle| \, d\mu_n \Bigr)^2 \le \int |\langle T_n, e_i \rangle|^2 \, d\mu_n \cdot \mu_n(K^c) = \int x_i^2 \, d\gamma \cdot \mu_n(K^c).$$

Using the tightness of $\{\mu_n\}$ we get that $\{|\langle T_n, e_i \rangle| \cdot \mu_n\}$ is a tight sequence. In addition, note that for every bounded continuous $f$

$$\varlimsup_n \Bigl( \int f \, |\langle T_n, e_i \rangle| \, d\mu_n \Bigr)^2 \le \int x_i^2 \, d\gamma \cdot \int f^2 \, d\mu.$$

This implies that any limiting point of $\{\langle T_n, e_i \rangle \cdot \mu_n\}$ is absolutely continuous with respect to $\mu$. Applying the diagonal method and passing to a subsequence one can assume that convergence takes place for all $i$ simultaneously. Consequently, there exists a subsequence $\{n_k\}$ and a measurable mapping $T$ with values in $\mathbb{R}^\infty$ such that

$$\langle T_{n_k}, e_i \rangle \cdot \mu_{n_k} \to \langle T, e_i \rangle \cdot \mu$$

weakly in the sense of measures for every $i$. It is easy to check that the standard property of $L^2$-weak convergence holds also in this case:

$$\int \langle T, e_i \rangle^2 \, d\mu \le \varliminf_k \int \langle T_{n_k}, e_i \rangle^2 \, d\mu_{n_k} = \int x_i^2 \, d\gamma = 1. \tag{12}$$

Finally, we pass to the limit in (11) (here we apply (12) and Remark 7.2) and obtain

$$\sum_{i=-m}^{m} \int \langle T - T_m, e_i \rangle^2 \, d\mu \le 2 C_m. \tag{13}$$

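Note that $T$ commutes with the shift $\sigma$: $\langle T \circ \sigma, e_i \rangle = \langle T, e_{i-1} \rangle$. Indeed, for every bounded cylindrical $\varphi$ one has

$$\int \varphi \langle T_n, e_{i-1} \rangle \, d\mu_n = \int \varphi \langle T_n(\sigma_n), e_i \rangle \, d\mu_n = \int \varphi(\sigma_n^{-1}) \langle T_n, e_i \rangle \, d\mu_n = \int \varphi(\sigma^{-1}) \langle T_n, e_i \rangle \, d\mu_n.$$

Here we use that $\varphi(\sigma_n^{-1}) = \varphi(\sigma^{-1})$ for sufficiently large values of $n$ and the cyclical invariance of $T_n$. Passing to the limit along the subsequence $\{n_k\}$ one gets

$$\int \varphi \langle T, e_{i-1} \rangle \, d\mu = \int \varphi(\sigma^{-1}) \langle T, e_i \rangle \, d\mu = \int \varphi \langle T \circ \sigma, e_i \rangle \, d\mu.$$

Hence $T \circ \sigma = \sigma \circ T$.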

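Using invariances, we get that

$$\int \langle T - T_m, e_i \rangle^2 \, d\mu = \int \langle T \circ \sigma^l - T_m \circ \sigma_m^l, e_{i+l} \rangle^2 \, d\mu = \int \langle T \circ \sigma^l \circ (\sigma_m^l)^{-1} - T_m, e_{i+l} \rangle^2 \, e^{-u_{m,l}} \, d\mu$$

(for $i, i + l \in [-m, m]$). Applying assumption 3) we get by the Cauchy-Bunyakovsky inequality

$$C \int \langle T - T_m, e_i \rangle^2 \, d\mu \ge \int e^{u_{m,l}} \, d\mu \int \langle T \circ \sigma^l \circ (\sigma_m^l)^{-1} - T_m, e_{i+l} \rangle^2 \, e^{-u_{m,l}} \, d\mu \ge \Bigl( \int |\langle T \circ \sigma^l \circ (\sigma_m^l)^{-1} - T_m, e_{i+l} \rangle| \, d\mu \Bigr)^2. \tag{14}$$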

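We note that $\lim_m \sigma^l \circ (\sigma_m^l)^{-1}(x) = x$ for every $x$. Let us show that

$$\langle T \circ \sigma^l \circ (\sigma_m^l)^{-1}, e_i \rangle \to \langle T, e_i \rangle$$

in $L^1(\mu)$. To this end take a continuous bounded function $\tilde T_i$ such that $\int |\tilde T_i - \langle T, e_i \rangle|^2 \, d\mu < \varepsilon^2$. Then

$$\int |\langle T \circ \sigma^l \circ (\sigma_m^l)^{-1} - T, e_i \rangle| \, d\mu \le \int |\tilde T_i \circ \sigma^l \circ (\sigma_m^l)^{-1} - \tilde T_i| \, d\mu$$

$$+ \int |\langle T \circ \sigma^l \circ (\sigma_m^l)^{-1}, e_i \rangle - \tilde T_i \circ \sigma^l \circ (\sigma_m^l)^{-1}| \, d\mu + \int |\langle T, e_i \rangle - \tilde T_i| \, d\mu.$$

We will show that the first integral in the right-hand side tends to zero as $m \to \infty$ and the others are small (of the order $\varepsilon$). We prove the first statement only, for the second one the arguments are similar. Note that $\lim_m \tilde T_i \circ \sigma^l \circ (\sigma_m^l)^{-1} = \tilde T_i$ pointwise. It is sufficient to show that $\sup_m \int |\tilde T_i \circ \sigma^l \circ (\sigma_m^l)^{-1}|^{1+\varepsilon} \, d\mu < \infty$ for some $\varepsilon > 0$.

One easily gets the desired estimate:

$$\int |\tilde T_i \circ \sigma^l \circ (\sigma_m^l)^{-1}|^{1+\varepsilon} \, d\mu = \int |\tilde T_i \circ \sigma^l|^{1+\varepsilon} \, e^{u_{m,l}} \, d\mu \le \|\tilde T_i\|_{L^2(\mu)}^{1+\varepsilon} \, \|e^{u_{m,l}}\|_{L^{2/(1-\varepsilon)}(\mu)} \le C,$$

where $2/(1 - \varepsilon) = 2 + \delta$ and $C$ is independent of $m$.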

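Using convergence $\langle T \circ \sigma^l \circ (\sigma_m^l)^{-1}, e_k \rangle \to \langle T, e_k \rangle$ in $L^1(\mu)$ we get from (14) and (13) that for some $C$ independent of $m$, $l$

$$C \cdot \lim_m \int \langle T - T_m, e_i \rangle^2 \, d\mu \ge \lim_m \Bigl( \int |\langle T - T_m, e_{i+l} \rangle| \, d\mu \Bigr)^2.$$

In particular, fixing some $i_0$ and setting $l = i_0 - i$, we get

$$C \, C_m \ge C \cdot \lim_m \sum_{i=-m}^{m} \int \langle T - T_m, e_i \rangle^2 \, d\mu \ge \lim_m m \cdot \Bigl( \int |\langle T - T_m, e_{i_0} \rangle| \, d\mu \Bigr)^2.$$

Applying the relation $\lim_m \frac{C_m}{m} = 0$, we get that $\lim_m \int |\langle T - T_m, e_{i_0} \rangle| \, d\mu = 0$ for every $i_0$. Hence one can extract a subsequence such that $\langle T_{m_k}, e_i \rangle \to \langle T, e_i \rangle$ in $L^1(\mu)$ for every $i$ (in what follows we denote this subsequence again by $\{T_n\}$). Moreover, it follows from (11) that one has convergence in $L^{2-\varepsilon}(\mu)$ for every $\varepsilon > 0$.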

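It remains to show that $T$ maps $\mu$ onto $\gamma$. Fix a smooth Lipschitz function $f$ which depends on a finite number of coordinates $(x_{-k}, \dots, x_k)$. By the change of variables formula one has for every $n > k$

$$\int f(T_n) \, d\mu_n = \int f \, d\gamma.$$

On the other hand, let us approximate $P_k T$ by a mapping $\tilde T_\varepsilon : \mathbb{R}^\infty \to P_k(\mathbb{R}^\infty)$ such that $\|P_k(T) - \tilde T_\varepsilon\|_{L^{(2+\delta)^*}(\mu)} < \varepsilon$ and $\langle \tilde T_\varepsilon, e_j \rangle$ is smooth and bounded for every $-k \le j \le k$. One has

$$\Bigl| \int f(T_n) \, d\mu_n - \int f(\tilde T_\varepsilon) \, d\mu_n \Bigr| \le \|f\|_{Lip} \int \|P_k(T_n) - \tilde T_\varepsilon\| \, d\mu_n$$

$$\le C \|f\|_{Lip} \bigl( \|P_k(T_n - T)\|_{L^{(2+\delta)^*}(\mu)} + \|P_k(T) - \tilde T_\varepsilon\|_{L^{(2+\delta)^*}(\mu)} \bigr),$$

where $C = \sup_n \|\rho_n\|_{L^{2+\delta}(\mu)}$, $\rho_n = \frac{d\mu_n}{d(\mu \circ P_n^{-1})}$. Using $L^{(2+\delta)^*}(\mu)$-convergence $\langle T_n, e_i \rangle \to \langle T, e_i \rangle$, smoothness of $f(\tilde T_\varepsilon)$, and the weak convergence $\mu_n \to \mu$, we pass to the limsup in the inequality. We obtain

$$\varlimsup_n \Bigl| \int f(T_n) \, d\mu_n - \int f(\tilde T_\varepsilon) \, d\mu \Bigr| \le \varepsilon C \|f\|_{Lip}.$$

Choosing an appropriate sequence $\tilde T_\varepsilon \to T$ we get $\int f(T) \, d\mu = \lim_n \int f(T_n) \, d\mu_n = \int f \, d\gamma$. The proof is complete. □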

Below we apply Theorem 7.4 to Gibbs measures. We study a transportation of a Gibbs measure $\mu$ which can be formally written in the form

$$\mu = e^{-H(x)} \, dx,$$

where the potential $H$ admits the following heuristic representation:

$$H(x) = \sum_{i=-\infty}^{+\infty} V(x_i) + \sum_{i=-\infty}^{+\infty} W(x_i, x_{i+1}).$$

Here $V$ and $W$ are smooth functions and $W(x, y)$ is symmetric: $W(x, y) = W(y, x)$. The existence of such measures was proved in [2].

Let us specify the assumptions about $V$ and $W$ below. These are a particular case of assumptions A1-A3 from [2].

1) $W(x, y) = W(y, x)$;

2) There exist numbers $J > 0$, $L > 1$, $N > 2$, $\sigma > 0$, and $A, B, C > 0$ such that

$$|W(x, y)| \le J (1 + |x| + |y|)^{N-1}, \quad |\partial_x W(x, y)| \le J (1 + |x| + |y|)^{N-1};$$

3) $|V(x)| \le C (1 + |x|)^{L}$, $|V'(x)| \le C (1 + |x|)^{L-1}$;

4) (coercivity assumption)

$$V'(x) \cdot x \ge A |x|^{N+\sigma} - B.$$

Let us define the following probability measure on $E_n$:

$$\mu_n = \frac{1}{Z_n} \exp \Bigl( -\sum_{i=-n}^{n} \bigl( V(x_i) + W(x_i, x_{i+1}) \bigr) \Bigr) \, dx$$

with the convention $x_{n+1} := x_{-n}$. Here $Z_n$ is the normalizing constant.
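
The cyclic convention $x_{n+1} := x_{-n}$ is what makes $\mu_n$ invariant under the cyclical shift $\sigma_n$: the energy in the exponent does not change when the coordinates are rotated. The sketch below checks this for one point, using illustrative choices $V(t) = t^4$ and $W(s,t) = st$ that are assumptions of this example only and are not claimed to be the potentials considered in the paper.

```python
from math import isclose


def V(t):
    """Illustrative single-site potential (an assumption of this sketch)."""
    return t ** 4


def W(s, t):
    """Illustrative symmetric nearest-neighbour interaction (an assumption)."""
    return s * t


def energy(x):
    """sum_i V(x_i) + W(x_i, x_{i+1}) over a list (x_{-n}, ..., x_n), with the
    cyclic convention that the successor of the last coordinate is the first."""
    d = len(x)
    return sum(V(x[i]) + W(x[i], x[(i + 1) % d]) for i in range(d))


def cyclic_shift(x):
    """sigma_n: (x_{-n}, ..., x_n) -> (x_n, x_{-n}, ..., x_{n-1})."""
    return x[-1:] + x[:-1]


x = [0.5, -1.25, 2.0, 0.75, -0.5]
h_original = energy(x)
h_shifted = energy(cyclic_shift(x))
print(isclose(h_original, h_shifted, rel_tol=1e-12))
```

Since Lebesgue measure on $E_n$ is also invariant under coordinate rotations, equality of the two energies gives $\sigma_n$-invariance of the density and hence of $\mu_n$.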

Proposition 7.5. The sequence $\mu_n$ admits a weakly convergent subsequence $\mu_{n_k} \to \mu$ satisfying the assumptions of Theorem 7.4.

Proof. It was proved in Theorem 3.1 of [2] that any sequence of probability measures

$$\mu_n = c_n e^{-H_n} \, dx_{-n} \cdots dx_n,$$

where $H_n$ is obtained from $H$ by fixing a boundary condition $x$,

$$H_n = \sum_{i=-n}^{n} V(x_i) + \sum_{i=-n}^{n-1} W(x_i, x_{i+1}) + W(x_{-(n+1)}, x_{-n}) + W(x_n, x_{n+1}),$$

has a weakly convergent subsequence $\mu_{n_k} \to \mu$. In addition (see [2]), $\mu$ satisfies the following a priori estimate: for every $\lambda > 0$

$$\sup_{k \in \mathbb{Z}} \int \exp(\lambda |x_k|^N) \, d\mu < \infty.$$

The same estimate holds for $\mu_n$ uniformly in $n$.

Following the reasoning from [2] it is easy to show that the sequence $\{\mu_n\}$ is tight and satisfies the same a priori estimate. Thus, we can pass to a subsequence which weakly converges to a measure $\mu$. For the sake of simplicity this subsequence will be denoted by $\{\mu_n\}$ again. The limiting measure $\mu$ satisfies

$$\sup_{k \in \mathbb{Z}} \int \exp(\lambda |x_k|^N) \, d\mu < \infty, \tag{15}$$

moreover,

$$\sup_n \sup_{k \in \mathbb{Z}} \int \exp(\lambda |x_k|^N) \, d\mu_n < \infty. \tag{16}$$

Let us estimate the relative entropy. We note that $\mu_n$ and $\mu_m$ are related in the following way:

$$\mu_m \times \nu_{m,n} = \frac{e^{Z}}{\int e^{Z} \, d\mu_n} \cdot \mu_n,$$

where $Z = -W(x_m, x_{-m}) + W(x_m, x_{m+1}) + W(x_{-m-1}, x_{-m})$, and $\nu_{m,n}$ is a probability measure on $E_{m,n}$. Set $\mu_{m,n} = \nu_{m,n}$. Then

$$\int \log \Bigl( \frac{d\mu_n}{d(\mu_m \times \mu_{m,n})} \Bigr) d\mu_n = \int \Bigl( \log \int e^{Z} \, d\mu_n - Z \Bigr) d\mu_n.$$

The desired bound follows immediately from (16) and the assumptions about $W$.
Take a cyclical shift $\sigma_m$ and a power $l$. For every $\mu_n$ with $n > m + l$ (hence



for the limit fj.) the measure — , I < m, has a density e Um ' ! with 



Z \-l 

u m ,i = W(x m _i + i,x m _i + W(x_ m ,x_ m+ i) + W(x m -i + i,x m _i) 
- W(x m+1 ,x m ^i) - W(x_ m ,x m ) - W(x m _ z+ i,a;_ m _i). 



Assumption 3) follows from the same bounds on $W$ and (16).
In order to prove assumption 4) we note that 



e W(x„,x n+1 )+W(x- n -i,x- n ) , „ 



o _P WCa;„ x „1 



j- e W(x„,i„ +1 ) + »'(x_„_,,i_ n ) dM - | e VV(x n ,x_ n ) d/in - 

The normalizing constants can be easily estimated with the help of a priori bounds for $\mu$ and $\mu_n$. Applying assumptions on $W$ one can easily get that

d{i o P„ 1 

where $C_1$, $C_2$ do not depend on $n$. Hence, assumption 4) follows immediately from (15). □